A probabilistic approach to spatiotemporal theme pattern mining on weblogs
Source: International World Wide Web Conference
Proceedings of the 15th International Conference on World Wide Web
Edinburgh, Scotland
Session: Data mining
Pages: 533-542
Year of Publication: 2006
ISBN: 1-59593-323-9
Authors
Qiaozhu Mei, University of Illinois at Urbana-Champaign, Urbana, IL
Chao Liu, University of Illinois at Urbana-Champaign, Urbana, IL
Hang Su, Vanderbilt University, Nashville, TN
ChengXiang Zhai, University of Illinois at Urbana-Champaign, Urbana, IL
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 29,   Downloads (12 Months): 246,   Citation Count: 16
DOI: http://doi.acm.org/10.1145/1135777.1135857

ABSTRACT

Mining subtopics from weblogs and analyzing their spatiotemporal patterns have applications in multiple domains. In this paper, we define the novel problem of mining spatiotemporal theme patterns from weblogs and propose a novel probabilistic approach to model the subtopic themes and spatiotemporal theme patterns simultaneously. The proposed model discovers spatiotemporal theme patterns by (1) extracting common themes from weblogs; (2) generating theme life cycles for each given location; and (3) generating theme snapshots for each given time period. Evolution of patterns can be discovered by comparative analysis of theme life cycles and theme snapshots. Experiments on three different data sets show that the proposed approach can discover interesting spatiotemporal theme patterns effectively. The proposed probabilistic model is general and can be used for spatiotemporal text mining on any domain with time and location information.
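
As a rough illustration of steps (2) and (3) in the abstract, the following minimal Python sketch normalizes theme weights over time for one location (a "life cycle") and over themes and locations for one time period (a "snapshot"). It is not the paper's model: it assumes that per-document theme weights have already been produced by some theme-extraction step, and the function names, the record layout (`time`, `location`, `themes`), and the toy data are all hypothetical.

```python
from collections import defaultdict


def theme_life_cycle(docs, theme, location):
    """Share of a theme's weight falling in each time period, for one location."""
    mass = defaultdict(float)
    for d in docs:
        if d["location"] == location:
            mass[d["time"]] += d["themes"].get(theme, 0.0)
    total = sum(mass.values())
    if not total:
        return {}
    return {t: m / total for t, m in sorted(mass.items())}


def theme_snapshot(docs, time):
    """Share of weight per (theme, location) pair within one time period."""
    mass = defaultdict(float)
    for d in docs:
        if d["time"] == time:
            for theme, weight in d["themes"].items():
                mass[(theme, d["location"])] += weight
    total = sum(mass.values())
    if not total:
        return {}
    return {key: m / total for key, m in mass.items()}


# Hypothetical toy input: theme weights are assumed to be given per document.
docs = [
    {"time": 1, "location": "TX", "themes": {"storm": 0.75, "aid": 0.25}},
    {"time": 2, "location": "TX", "themes": {"storm": 0.25, "aid": 0.75}},
    {"time": 1, "location": "IL", "themes": {"storm": 0.5, "aid": 0.5}},
]

life_cycle = theme_life_cycle(docs, "storm", "TX")
snapshot = theme_snapshot(docs, 1)
```

Comparing `life_cycle` across locations, or `snapshot` across time periods, corresponds loosely to the comparative analysis the abstract mentions. The sketch uses simple counting and normalization rather than the probabilistic estimation the paper proposes.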

