Article

A system for query-specific document summarization

Authors:

Ramakrishna Varadarajan,

Vagelis Hristidis

CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management

Pages 622 - 631

https://doi.org/10.1145/1183614.1183703

Published: 06 November 2006

Abstract

There has been a great amount of work on query-independent summarization of documents. However, due to the success of Web search engines, query-specific document summarization (query result snippets) has become an important problem that has received little attention. We present a method that creates query-specific summaries by identifying the most query-relevant fragments and combining them using the semantic associations within the document. In particular, we first add structure to the documents in a preprocessing stage and convert them to document graphs. The best summaries are then computed by calculating the top spanning trees on the document graphs. We present and experimentally evaluate efficient algorithms that compute summaries in interactive time. Furthermore, we compare the quality of our summarization method to current approaches using a user survey.
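The pipeline described in the abstract (fragments, document graph, tree connecting the query-relevant fragments) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how edges are weighted or how the top spanning trees are found. The sketch assumes that fragments are given as a list of strings, that two fragments are associated when they share words (edge cost 1 / number of shared words), and that a greedy shortest-path heuristic is an acceptable stand-in for the tree search. All function names (`tokenize`, `build_document_graph`, `summarize`) are hypothetical.

```python
import heapq
import itertools
import re


def tokenize(text):
    """Lowercased word set of a text (no stop-word removal; a simplification)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_document_graph(fragments):
    """Assumed graph model: nodes are fragment indices, and an edge joins two
    fragments that share at least one word, with cost 1 / (shared word count)."""
    words = [tokenize(f) for f in fragments]
    graph = {i: {} for i in range(len(fragments))}
    for i, j in itertools.combinations(range(len(fragments)), 2):
        shared = len(words[i] & words[j])
        if shared:
            graph[i][j] = graph[j][i] = 1.0 / shared
    return graph, words


def dijkstra(graph, sources):
    """Multi-source Dijkstra; returns distances and predecessor links."""
    dist = {s: 0.0 for s in sources}
    prev = {}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def summarize(fragments, query):
    """Greedy heuristic (an assumption, not the paper's method): grow a tree
    from the best-matching fragment until every query keyword found in the
    document is covered by some fragment in the tree."""
    keywords = tokenize(query)
    graph, words = build_document_graph(fragments)
    holders = {k: [i for i, w in enumerate(words) if k in w] for k in keywords}
    holders = {k: nodes for k, nodes in holders.items() if nodes}
    if not holders:
        return []
    seed = max(range(len(fragments)), key=lambda i: len(keywords & words[i]))
    tree = {seed}
    for k in sorted(holders):  # sorted for deterministic output
        if tree & set(holders[k]):
            continue  # keyword already covered
        dist, prev = dijkstra(graph, tree)
        reachable = [n for n in holders[k] if n in dist]
        if not reachable:
            continue  # keyword lives in a disconnected part of the graph
        node = min(reachable, key=dist.get)
        while node not in tree:  # walk the cheapest path back to the tree
            tree.add(node)
            node = prev[node]
    return [fragments[i] for i in sorted(tree)]
```

Calling `summarize(fragments, "graph snippets")` returns the fragments that contain the keywords plus any intermediate fragments needed to connect them, in document order. Connecting fragments through shared context, rather than returning isolated matching sentences, is the idea the abstract motivates; the weighting and search strategy here are placeholders.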

Index Terms

A system for query-specific document summarization
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Published In

CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management

November 2006

916 pages

ISBN:1595934332

DOI:10.1145/1183614

General Chair: Philip S. Yu, IBM T.J. Watson Research Center (USA)

Program Chairs: Vassilis Tsotras, University of California-Riverside (USA); Edward Fox, Virginia Tech (USA); Bing Liu, University of Illinois at Chicago (USA)

Copyright © 2006 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM06

CIKM06: Conference on Information and Knowledge Management

November 6 - 11, 2006

Arlington, Virginia, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
