DOI: 10.1145/2009916.2009997
research-article

Intent-aware search result diversification

Published: 24 July 2011

ABSTRACT

Search result diversification has gained momentum as a way to tackle ambiguous queries. An effective approach to this problem is to explicitly model the possible aspects underlying a query, in order to maximise the estimated relevance of the retrieved documents with respect to the different aspects. However, such aspects themselves may represent information needs with rather distinct intents (e.g., informational or navigational). Hence, a diverse ranking could benefit from applying intent-aware retrieval models when estimating the relevance of documents to different aspects. In this paper, we propose to diversify the results retrieved for a given query, by learning the appropriateness of different retrieval models for each of the aspects underlying this query. Thorough experiments within the evaluation framework provided by the diversity task of the TREC 2009 and 2010 Web tracks show that the proposed approach can significantly improve state-of-the-art diversification approaches.
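The idea in the abstract can be illustrated with a small sketch. It is not the paper's formulation: it assumes a greedy coverage-and-novelty objective in the style of explicit diversification frameworks such as xQuAD, and it assumes that the learned "appropriateness" of each retrieval model for an aspect is available as a weight. All names (`diversify`, `aspect_relevance`, `model_scores`, `model_weight`, `lam`) and all numbers are hypothetical.

```python
from typing import Dict, List

Scores = Dict[str, float]  # document id -> score in [0, 1]


def aspect_relevance(doc: str,
                     model_scores: Dict[str, Scores],
                     model_weight: Dict[str, float]) -> float:
    """Mix the scores of several retrieval models for one aspect.

    model_weight[m] stands in for the learned appropriateness of model m
    for this aspect (assumed to sum to 1 over the models).
    """
    return sum(w * model_scores.get(m, {}).get(doc, 0.0)
               for m, w in model_weight.items())


def diversify(candidates: List[str],
              relevance: Scores,
              aspect_weight: Dict[str, float],
              model_scores: Dict[str, Dict[str, Scores]],
              model_weight: Dict[str, Dict[str, float]],
              k: int,
              lam: float = 0.5) -> List[str]:
    """Greedily pick k documents, trading query relevance against coverage
    of aspects that the already selected documents have not yet satisfied."""
    selected: List[str] = []
    remaining = list(candidates)
    # novelty[a] is the estimated probability that aspect a is still unsatisfied.
    novelty = {a: 1.0 for a in aspect_weight}

    while remaining and len(selected) < k:
        def gain(d: str) -> float:
            diversity = sum(
                aspect_weight[a]
                * aspect_relevance(d, model_scores[a], model_weight[a])
                * novelty[a]
                for a in aspect_weight)
            return (1.0 - lam) * relevance.get(d, 0.0) + lam * diversity

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
        # Discount every aspect by how well the chosen document covers it.
        for a in aspect_weight:
            novelty[a] *= 1.0 - aspect_relevance(
                best, model_scores[a], model_weight[a])
    return selected


# Toy data (made up): aspect a1 is treated as navigational, a2 as informational.
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.3}
aspect_weight = {"a1": 0.5, "a2": 0.5}
model_scores = {
    "a1": {"nav": {"d1": 0.9, "d2": 0.8, "d3": 0.0},
           "inf": {"d1": 0.2, "d2": 0.2, "d3": 0.1}},
    "a2": {"nav": {"d1": 0.0, "d2": 0.0, "d3": 0.1},
           "inf": {"d1": 0.0, "d2": 0.1, "d3": 0.9}},
}
model_weight = {"a1": {"nav": 1.0, "inf": 0.0},
                "a2": {"nav": 0.0, "inf": 1.0}}

ranking = diversify(["d1", "d2", "d3"], relevance, aspect_weight,
                    model_scores, model_weight, k=3, lam=0.8)
print(ranking)
```

With `lam=0.0` the sketch reduces to ranking by query relevance alone. Raising `lam` promotes documents that cover aspects not yet satisfied, each aspect being scored by the model mixture assumed to suit its intent.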

Published in

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916

Copyright © 2011 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 792 of 3,983 submissions, 20%
