ABSTRACT
With the popularity of social media platforms such as Facebook and Twitter, the amount of useful data in these sources is rapidly increasing, making them promising places for information acquisition. This research aims at the customized organization of a social media corpus using a focused topic hierarchy. It organizes the contents into different structures to meet users' different information needs (e.g., "iPhone 5 problem" or "iPhone 5 camera"). To this end, we introduce a novel function to measure the likelihood of a topic hierarchy, by which the users' information needs can be incorporated into the process of topic hierarchy construction. Using the structural information within the generated topic hierarchy, we then develop a probability-based model to identify the representative contents for each topic, assisting users in document retrieval on the hierarchy. Experimental results on real-world data illustrate the effectiveness of our method and its superiority over state-of-the-art methods for both information organization and retrieval tasks.