DOI: 10.1145/1835449.1835657
poster

Diversification of search results using webgraphs

Published: 19 July 2010

Abstract

A set of words is often insufficient to express a user's information need. Diversification is a reasonable strategy for accounting for the various information needs associated with a query. By diversifying the result set, we increase the probability that the results are relevant to the user's information need when the given query is ambiguous. A diverse result set must contain documents that cover the various subtopics of a given query. We propose a graph-based method that exploits the link structure of the web to return a ranked list that provides complete coverage for a query. Our method not only adds diversity to the result set but also avoids excessive redundancy. Moreover, the probability of relevance of a document is conditioned on the documents that appear before it in the result list. We show the effectiveness of our method by comparing it with a query-likelihood model as the baseline.
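
The abstract does not spell out the algorithm, so the following is only a generic illustration of the idea that a document's usefulness depends on what is ranked before it. It is a minimal greedy reranking sketch, not the authors' webgraph method. The function name `greedy_diversify`, the relevance scores, the subtopic labels, and the scoring rule (relevance multiplied by one plus the number of newly covered subtopics) are all assumptions made for this example; in a webgraph setting the subtopic sets might be derived from link structure, but that step is not shown.

```python
from typing import Dict, List, Set


def greedy_diversify(relevance: Dict[str, float],
                     subtopics: Dict[str, Set[str]],
                     k: int) -> List[str]:
    """Rank up to k documents, favouring ones that cover new subtopics.

    Hypothetical scoring rule: relevance * (1 + newly covered subtopics).
    The score of a document therefore depends on the documents ranked
    before it, because they determine which subtopics are already covered.
    """
    covered: Set[str] = set()      # subtopics covered by the ranking so far
    ranked: List[str] = []
    candidates = set(relevance)

    while candidates and len(ranked) < k:
        def gain(doc: str) -> float:
            new = len(subtopics.get(doc, set()) - covered)
            return relevance[doc] * (1 + new)

        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(candidates), key=gain)
        ranked.append(best)
        covered |= subtopics.get(best, set())
        candidates.remove(best)

    return ranked


# Toy data (invented): d1 and d2 are redundant, d3 covers a different subtopic.
scores = {"d1": 0.9, "d2": 0.85, "d3": 0.5}
topics = {"d1": {"a"}, "d2": {"a"}, "d3": {"b"}}
print(greedy_diversify(scores, topics, k=3))
```

On this toy input the redundant document `d2` is pushed below `d3`, even though `d2` has the higher standalone relevance score, which is the kind of redundancy avoidance the abstract describes.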

    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN:9781450301534
    DOI:10.1145/1835449
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. diversity
    2. information retrieval
    3. webgraphs

    Qualifiers

    • Poster

    Conference

    SIGIR '10

    Acceptance Rates

    SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2016) Optimizing top-k retrieval. Frontiers of Computer Science: Selected Publications from Chinese Universities 10:3 (477-487). DOI: 10.1007/s11704-015-5222-7. Online publication date: 1-Jun-2016
    • (2014) Diversifying Top-k Service Retrieval. Proceedings of the 2014 IEEE International Conference on Services Computing (227-234). DOI: 10.1109/SCC.2014.38. Online publication date: 27-Jun-2014
    • (2014) Optimizing Top-k Retrieval: Submodularity Analysis and Search Strategies. Web-Age Information Management (18-29). DOI: 10.1007/978-3-319-08010-9_3. Online publication date: 2014
    • (2013) Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction. Computational Linguistics 39:3 (709-754). DOI: 10.1162/COLI_a_00148. Online publication date: Sep-2013
    • (2013) Mining subtopics from text fragments for a web query. Information Retrieval 16:4 (484-503). DOI: 10.1007/s10791-013-9221-8. Online publication date: 27-Feb-2013
    • (2012) Top-k retrieval using facility location analysis. Proceedings of the 34th European conference on Advances in Information Retrieval (305-316). DOI: 10.1007/978-3-642-28997-2_26. Online publication date: 1-Apr-2012
