ABSTRACT
We propose a generative model based on latent Dirichlet allocation for mining distinct topics in document collections by integrating the temporal ordering of documents into the generative process. The document collection is divided into time segments, and the topics discovered in each segment are propagated to influence topic discovery in the subsequent segments. We conduct experiments on a collection of academic papers from the CiteSeer repository. We augment the text corpus with user queries and tags, and integrate the citation graph to boost the weight of topical terms. The experimental results show that the segmented topic model can effectively detect distinct topics and their evolution over time.
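The segment-and-propagate idea can be illustrated with a minimal sketch. The abstract does not say how topics are carried from one segment to the next, so the mechanism below is an assumption: the previous segment's topic-word counts are turned into Dirichlet pseudo-counts that would seed the word prior of each topic in the next segment. The function names `split_into_segments` and `propagate_prior`, and the parameters `segment_length`, `base_beta`, and `carry`, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def split_into_segments(docs, segment_length):
    """Group (year, tokens) pairs into consecutive time segments.

    Segment i holds the documents whose year falls in
    [start + i * segment_length, start + (i + 1) * segment_length),
    where start is the earliest year in the collection.
    """
    if not docs:
        return []
    start = min(year for year, _ in docs)
    segments = defaultdict(list)
    for year, tokens in docs:
        segments[(year - start) // segment_length].append(tokens)
    # Keep empty segments so that list positions follow the timeline.
    return [segments[i] for i in range(max(segments) + 1)]


def propagate_prior(prev_topic_word_counts, vocabulary,
                    base_beta=0.01, carry=0.5):
    """Build per-topic word priors for the next segment (assumed scheme).

    Each word receives the symmetric base prior plus a share of `carry`
    proportional to how often the previous segment assigned that word
    to the topic.
    """
    priors = []
    for counts in prev_topic_word_counts:
        total = sum(counts.values())
        prior = {}
        for word in vocabulary:
            share = counts.get(word, 0) / total if total else 0.0
            prior[word] = base_beta + carry * share
        priors.append(prior)
    return priors


# Usage: three papers, two-year segments, one topic carried forward.
papers = [(2000, ["topic", "model"]), (2001, ["topic"]), (2004, ["graph"])]
segments = split_into_segments(papers, segment_length=2)
next_prior = propagate_prior([{"topic": 3, "model": 1}],
                             ["topic", "model", "graph"])
```

With `carry=0` every segment would be modeled independently; larger values tie each segment's topics more closely to those of its predecessor, which is one way to read "propagated to influence" in the abstract.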
Index Terms
- Finding topic trends in digital libraries