A system for query-specific document summarization

Published: 06 November 2006

Abstract

Query-independent document summarization has been studied extensively. However, with the success of Web search engines, query-specific document summarization (query result snippets) has become an important problem that has received little attention. We present a method that creates query-specific summaries by identifying the most query-relevant fragments and combining them using the semantic associations within the document. In particular, we first add structure to the documents in a preprocessing stage and convert them to document graphs. The best summaries are then computed as the top spanning trees on the document graphs. We present and experimentally evaluate efficient algorithms that compute summaries in interactive time. Finally, we compare the quality of our summarization method to current approaches using a user survey.
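The pipeline described in the abstract (text fragments, a document graph, a small tree connecting the query-relevant fragments) can be illustrated with a minimal sketch. This is not the paper's algorithm: the fragment splitting, the edge cost of 1 / (number of shared words), and the root-based shortest-path heuristic used in place of a true top spanning tree computation are all assumptions made for illustration, and the function names `build_document_graph`, `shortest_paths`, and `summarize` are hypothetical.

```python
import heapq
import re
from itertools import combinations


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_document_graph(fragments, min_shared=1):
    """Nodes are fragment indices; edge cost is 1 / (shared word count).

    Assumption: shared vocabulary stands in for the semantic
    associations mentioned in the abstract.
    """
    words = [tokenize(f) for f in fragments]
    graph = {i: {} for i in range(len(fragments))}
    for i, j in combinations(range(len(fragments)), 2):
        shared = len(words[i] & words[j])
        if shared >= min_shared:
            graph[i][j] = graph[j][i] = 1.0 / shared
    return graph, words


def shortest_paths(graph, source):
    """Dijkstra's algorithm; returns distances and predecessor links."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def summarize(fragments, query):
    """Return the fragments on the cheapest tree found that covers every keyword.

    Heuristic: try each node as root and connect it to the nearest
    fragment containing each keyword; keep the cheapest result.
    """
    graph, words = build_document_graph(fragments)
    keywords = tokenize(query)
    best = None
    for root in graph:
        dist, prev = shortest_paths(graph, root)
        nodes, cost = {root}, 0.0
        for kw in keywords:
            hits = [n for n in dist if kw in words[n]]
            if not hits:
                break  # this root cannot reach the keyword
            target = min(hits, key=lambda n: (dist[n], n))
            cost += dist[target]
            while target != root:  # walk the path back to the root
                nodes.add(target)
                target = prev[target]
        else:
            if best is None or (cost, len(nodes)) < best[:2]:
                best = (cost, len(nodes), sorted(nodes))
    return [fragments[i] for i in best[2]] if best else []
```

Calling `summarize(fragments, "steiner keyword")` on a list of sentences returns, in document order, the fragments on the selected tree, or an empty list when some keyword cannot be covered. A real system would at least filter stop words before building edges, since common words otherwise link unrelated fragments.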

    Published In

    CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
    November 2006
    916 pages
    ISBN:1595934332
    DOI:10.1145/1183614

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Steiner tree problem
    2. keyword search
    3. query-specific summarization
    4. user survey

    Qualifiers

    • Article

    Conference

    CIKM06
    CIKM06: Conference on Information and Knowledge Management
    November 6 - 11, 2006
    Arlington, Virginia, USA

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
