DOI: 10.1145/2063576.2063703
research-article

Summarizing web forum threads based on a latent topic propagation process

Published: 24 October 2011

Abstract

With an increasing amount of information in web forums, quick comprehension of forum threads has become a challenging research problem. To address this issue, this paper investigates the task of Web Forum Thread Summarization (WFTS), which aims to give a brief statement of each thread that involves multiple dynamic topics. When applied to WFTS, traditional summarization methods are hampered by topic dependencies, topic drifting, and text sparseness. Consequently, we explore an unsupervised topic propagation model, the Post Propagation Model (PPM), to overcome these problems by simultaneously modeling the semantics and the reply relationships existing in each thread. Each post in PPM is considered a mixture of topics, and a product of Dirichlet distributions in previous posts is employed to model the topic dependencies that arise during the asynchronous discussion. Based on this model, WFTS is accomplished by extracting the most significant sentences in a thread. Experimental results on two different forum data sets show that WFTS based on the PPM outperforms several state-of-the-art summarization methods in terms of ROUGE metrics.
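The abstract reports results in terms of ROUGE metrics but does not say which variants were used. As a rough illustration only, the sketch below computes clipped n-gram recall (the quantity behind ROUGE-N) against a single reference summary. The function name `rouge_n_recall`, the lowercasing, and the whitespace tokenization are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram recall of a candidate summary against one reference.

    Assumption: lowercased whitespace tokenization; real evaluations often
    add stemming, stop-word handling, and multiple references.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference has no n-grams of this order
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    print(rouge_n_recall("the cat sat", "the cat sat on the mat"))
```

A higher value means the extracted sentences recover more of the n-grams in the human-written reference summary.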

    Published In

    CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
    October 2011
    2712 pages
    ISBN:9781450307178
    DOI:10.1145/2063576

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. summarization
    2. thread comprehension
    3. topic modeling
    4. web forums

    Qualifiers

    • Research-article

    Conference

    CIKM '11

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
