DOI: 10.1145/1772690.1772780
research-article

Exploiting query reformulations for web search result diversification

Published: 26 April 2010

ABSTRACT

When a Web user's underlying information need is not clearly specified by the initial query, an effective approach is to diversify the results retrieved for this query. In this paper, we introduce a novel probabilistic framework for Web search result diversification, which explicitly accounts for the various aspects associated with an underspecified query. In particular, we diversify a document ranking by estimating how well a given document satisfies each uncovered aspect and the extent to which different aspects are satisfied by the ranking as a whole. We thoroughly evaluate our framework in the context of the diversity task of the TREC 2009 Web track. Moreover, we exploit query reformulations provided by three major Web search engines (WSEs) as a means to uncover different query aspects. The results attest to the effectiveness of our framework when compared to state-of-the-art diversification approaches in the literature. Additionally, by simulating an upper-bound query reformulation mechanism from official TREC data, we draw useful insights regarding the effectiveness of the query reformulations generated by the different WSEs in promoting diversity.
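The re-ranking idea described in the abstract can be illustrated with a minimal greedy sketch. This is not the paper's exact formulation, which the abstract does not spell out. The function name `diversify`, the mixing weight `lam`, and all probability values below are assumptions chosen for illustration. Each step picks the document that best balances its relevance to the query against how well it satisfies aspects that the documents already selected have not yet covered.

```python
from typing import Dict, List


def diversify(
    relevance: Dict[str, float],
    aspect_weight: Dict[str, float],
    coverage: Dict[str, Dict[str, float]],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily select up to k documents (illustrative sketch only).

    relevance[d]      -- assumed estimate of how relevant d is to the query
    aspect_weight[a]  -- assumed importance of aspect a for the query
    coverage[a][d]    -- assumed estimate of how well d satisfies aspect a
    lam               -- hypothetical trade-off between relevance and diversity
    """
    selected: List[str] = []
    # Per aspect: probability that no selected document satisfies it yet.
    unsatisfied = {a: 1.0 for a in aspect_weight}
    candidates = set(relevance)

    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            # Reward documents covering aspects the ranking has not covered.
            novelty = sum(
                aspect_weight[a] * coverage[a].get(doc, 0.0) * unsatisfied[a]
                for a in aspect_weight
            )
            return (1.0 - lam) * relevance[doc] + lam * novelty

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Aspects satisfied by the chosen document count less from now on.
        for a in aspect_weight:
            unsatisfied[a] *= 1.0 - coverage[a].get(best, 0.0)

    return selected


# Toy data for an ambiguous query such as "jaguar" (all values made up).
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.6}
aspect_weight = {"car": 0.6, "animal": 0.4}
coverage = {
    "car": {"d1": 0.9, "d2": 0.8},
    "animal": {"d3": 0.9},
}

diversified = diversify(relevance, aspect_weight, coverage, k=3, lam=0.5)
baseline = diversify(relevance, aspect_weight, coverage, k=3, lam=0.0)
print(diversified)  # ['d1', 'd3', 'd2']
print(baseline)     # ['d1', 'd2', 'd3']
```

In this toy run, the second car-related document drops below the animal-related one once the "car" aspect is largely covered by the first pick. In the setting the abstract describes, the aspects would be uncovered from query reformulations supplied by Web search engines rather than fixed by hand.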


    • Published in

      WWW '10: Proceedings of the 19th international conference on World wide web
      April 2010
      1407 pages
      ISBN: 9781605587998
      DOI: 10.1145/1772690

      Copyright © 2010 International World Wide Web Conference Committee (IW3C2)

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
