| A probabilistic approach to spatiotemporal theme pattern mining on weblogs |
| Full text |
Pdf
(477 KB)
|
| Source
|
International World Wide Web Conference
archive
Proceedings of the 15th international conference on World Wide Web
table of contents
Edinburgh, Scotland
SESSION: Data mining
table of contents
Pages: 533 - 542
Year of Publication: 2006
ISBN:1-59593-323-9
|
|
Authors
|
|
Qiaozhu Mei
|
University of Illinois at Urbana Champaign, Urbana, IL
|
|
Chao Liu
|
University of Illinois at Urbana Champaign, Urbana, IL
|
|
Hang Su
|
Vanderbilt University, Nashville, TN
|
|
ChengXiang Zhai
|
University of Illinois at Urbana Champaign, Urbana, IL
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 29, Downloads (12 Months): 246, Citation Count: 16
|
|
|
ABSTRACT
Mining subtopics from weblogs and analyzing their spatiotemporal patterns have applications in multiple domains. In this paper, we define the novel problem of mining spatiotemporal theme patterns from weblogs and propose a novel probabilistic approach to model the subtopic themes and spatiotemporal theme patterns simultaneously. The proposed model discovers spatiotemporal theme patterns by (1) extracting common themes from weblogs; (2) generating theme life cycles for each given location; and (3) generating theme snapshots for each given time period. Evolution of patterns can be discovered by comparative analysis of theme life cycles and theme snapshots. Experiments on three different data sets show that the proposed approach can discover interesting spatiotemporal theme patterns effectively. The proposed probabilistic model is general and can be used for spatiotemporal text mining on any domain with time and location information.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
|
| |
4
|
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of Royal Statist. Soc. B, 39:1--38, 1977.
|
 |
5
|
|
 |
6
|
|
| |
7
|
K. E. Gill. Blogging, rss and the information landscape: A look at online news. In WWW 2005 Workshop on the Weblogging Ecosystem, 2005.
|
| |
8
|
N. Glance, M. Hurst, and T. Tornkiyo. Blogpulse: Automated trend discovery for weblogs. In WWW 2004 Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, 2004.
|
| |
9
|
T. L. Gri'ths and M. Steyvers. Fiding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl.1):5228--5235, 2004.
|
 |
10
|
Daniel Gruhl , R. Guha , Ravi Kumar , Jasmine Novak , Andrew Tomkins, The predictive power of online chatter, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
[doi> 10.1145/1081870.1081883]
|
 |
11
|
Daniel Gruhl , R. Guha , David Liben-Nowell , Andrew Tomkins, Information diffusion through blogspace, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA
[doi> 10.1145/988672.988739]
|
 |
12
|
|
 |
13
|
|
| |
14
|
A. Kontostathis, L. Galitsky, W. M. Pottenger, S. Roy, and D. J. Phelps. A survey of emerging trend detection in textual data mining. Survey of Text Mining, pages 185--224, 2003.
|
 |
15
|
|
 |
16
|
|
 |
17
|
|
 |
18
|
|
 |
19
|
|
 |
20
|
Nikos Mamoulis , Huiping Cao , George Kollios , Marios Hadjieleftheriou , Yufei Tao , David W. Cheung, Mining, indexing, and querying historical spatiotemporal data, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
[doi> 10.1145/1014052.1014080]
|
 |
21
|
|
 |
22
|
|
 |
23
|
Daniel B. Neill , Andrew W. Moore , Maheshkumar Sabhnani , Kenny Daniel, Detection of emerging space-time clusters, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
[doi> 10.1145/1081870.1081897]
|
| |
24
|
|
| |
25
|
|
| |
26
|
B. Tseng, J. Tatemura, and Y. Wu. Tomographic clustering to visualize blog communities as mountain views. In WWW 2005 Workshop on the Weblogging Ecosystem, 2005.
|
 |
27
|
|
CITED BY 16
|
|
|
|
|
Yih-Farn Robin Chen , Giuseppe Di Fabbrizio , David Gibbon , Serban Jora , Bernard Renger , Bin Wei, Geotracker: geospatial and temporal RSS navigation, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
|
|
|
|
|
|
|
|
Xuanhui Wang , ChengXiang Zhai , Xiao Hu , Richard Sproat, Mining correlated bursty topic patterns from coordinated text streams, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
|
|
Qiaozhu Mei , Xu Ling , Matthew Wondra , Hang Su , ChengXiang Zhai, Topic sentiment mixture: modeling facets and opinions in weblogs, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
|
|
|
|
|
|
|
Yun Chi , Shenghuo Zhu , Xiaodan Song , Junichi Tatemura , Belle L. Tseng, Structural and temporal analysis of the blogosphere through community factorization, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
|
|
Xiaochuan Ni , Gui-Rong Xue , Xiao Ling , Yong Yu , Qiang Yang, Exploring in the weblog space by detecting informative and affective articles, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
|