research-article

Intent-aware search result diversification

Authors: Rodrygo L.T. Santos, Craig Macdonald, Iadh Ounis

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

Pages 595 - 604

https://doi.org/10.1145/2009916.2009997

Published: 24 July 2011

Abstract

Search result diversification has gained momentum as a way to tackle ambiguous queries. An effective approach to this problem is to explicitly model the possible aspects underlying a query, in order to maximise the estimated relevance of the retrieved documents with respect to the different aspects. However, such aspects themselves may represent information needs with rather distinct intents (e.g., informational or navigational). Hence, a diverse ranking could benefit from applying intent-aware retrieval models when estimating the relevance of documents to different aspects. In this paper, we propose to diversify the results retrieved for a given query, by learning the appropriateness of different retrieval models for each of the aspects underlying this query. Thorough experiments within the evaluation framework provided by the diversity task of the TREC 2009 and 2010 Web tracks show that the proposed approach can significantly improve state-of-the-art diversification approaches.
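
The idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic greedy, aspect-based re-ranking objective and treats the relevance of a document to an aspect as a mixture of intent-specific model scores (labelled `nav` and `inf` here), weighted by a per-aspect appropriateness estimate that would be learned in practice. All function names, the `lam` trade-off parameter, and the toy numbers are hypothetical.

```python
from typing import Dict, List

# Everything below is illustrative; no name or number comes from the paper.
Scores = Dict[str, float]  # document id -> relevance estimate in [0, 1]


def aspect_relevance(intent_weights: Dict[str, float],
                     model_scores: Dict[str, Scores],
                     doc: str) -> float:
    """Mix intent-specific model scores for one aspect, weighted by the
    estimated appropriateness of each model for that aspect."""
    return sum(w * model_scores[intent].get(doc, 0.0)
               for intent, w in intent_weights.items())


def diversify(query_scores: Scores, aspects: Dict[str, dict],
              k: int, lam: float = 0.5) -> List[str]:
    """Greedily pick k documents, trading off relevance to the query
    against coverage of aspects not yet satisfied by earlier picks."""
    selected: List[str] = []
    unsatisfied = {a: 1.0 for a in aspects}  # chance each aspect is still uncovered
    candidates = set(query_scores)
    while candidates and len(selected) < k:
        def gain(doc: str) -> float:
            diversity = sum(
                spec["importance"] * unsatisfied[a]
                * aspect_relevance(spec["intent_weights"], spec["model_scores"], doc)
                for a, spec in aspects.items())
            return (1 - lam) * query_scores[doc] + lam * diversity
        best = max(sorted(candidates), key=gain)  # sorted() makes ties deterministic
        selected.append(best)
        candidates.remove(best)
        for a, spec in aspects.items():
            unsatisfied[a] *= 1.0 - aspect_relevance(
                spec["intent_weights"], spec["model_scores"], best)
    return selected


# Toy data: one mostly navigational aspect and one mostly informational aspect.
query_scores = {"d1": 0.9, "d2": 0.8, "d3": 0.3}
aspects = {
    "homepage": {
        "importance": 0.5,
        "intent_weights": {"nav": 0.9, "inf": 0.1},
        "model_scores": {"nav": {"d3": 0.9}, "inf": {"d1": 0.2, "d3": 0.1}},
    },
    "tutorial": {
        "importance": 0.5,
        "intent_weights": {"nav": 0.1, "inf": 0.9},
        "model_scores": {"nav": {}, "inf": {"d1": 0.8, "d2": 0.7}},
    },
}

print(diversify(query_scores, aspects, k=2, lam=0.8))  # ['d1', 'd3']
```

With `lam=0.8`, the toy ranking picks `d1` and then promotes `d3`, because the navigational model scores it highly for the still-uncovered `homepage` aspect. With `lam=0.0`, the ranking falls back to plain relevance order (`d1`, `d2`).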

Index Terms

Information systems > Information retrieval > Retrieval models and ranking

Published In

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

July 2011

1374 pages

ISBN:9781450307574

DOI:10.1145/2009916

General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)

Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGIR '11

Sponsor:

SIGIR

SIGIR '11: The 34th International ACM SIGIR conference on research and development in Information Retrieval

July 24 - 28, 2011

Beijing, China

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

Total Citations: 60
Total Downloads: 874
Downloads (last 12 months): 39
Downloads (last 6 weeks): 6

Reflects downloads up to 20 Feb 2025

Cited By

Zhuang S, Zuccon G, Diaz F, Shah C, Suel T, Castells P, Jones R, Sakai T (2021). How do Online Learning to Rank Methods Adapt to Changes of Intent? Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 911-920. Online publication date: 11-Jul-2021.
https://dl.acm.org/doi/10.1145/3404835.3462937

Tonellotto N, Macdonald C (2020). Using an Inverted Index Synopsis for Query Latency and Performance Prediction. ACM Transactions on Information Systems, 38:3, pages 1-33. Online publication date: 18-May-2020.
https://dl.acm.org/doi/10.1145/3389795

Shajalal M, Aono M (2020). Health Information Retrieval. Signal Processing Techniques for Computational Health Informatics, pages 193-207. Online publication date: 8-Oct-2020.
https://doi.org/10.1007/978-3-030-54932-9_8

Wu Z, Zhou K, Liu Y, Zhang M, Ma S (2019). Does Diversity Affect User Satisfaction in Image Search. ACM Transactions on Information Systems, 37:3, pages 1-30. Online publication date: 8-May-2019.
https://dl.acm.org/doi/10.1145/3320118

Dou Z, Yang X, Li D, Wen J, Sakai T (2019). Low-cost, bottom-up measures for evaluating search result diversification. Information Retrieval Journal. Online publication date: 20-Apr-2019.
https://doi.org/10.1007/s10791-019-09356-x

Maxwell D, Azzopardi L, Moshfeghi Y (2019). The impact of result diversification on search behaviour and performance. Information Retrieval Journal. Online publication date: 16-May-2019.
https://doi.org/10.1007/s10791-019-09353-0

Li C, Chen S, Xing J, Sun A, Ma Z (2018). Seed-Guided Topic Model for Document Filtering and Classification. ACM Transactions on Information Systems, 37:1, pages 1-37. Online publication date: 6-Dec-2018.
https://dl.acm.org/doi/10.1145/3238250

Wang X, Wen J, Dou Z, Sakai T, Zhang R (2018). Search Result Diversity Evaluation Based on Intent Hierarchies. IEEE Transactions on Knowledge and Data Engineering, 30:1, pages 156-169. Online publication date: 1-Jan-2018.
https://doi.org/10.1109/TKDE.2017.2729559

Meng Z, Shen H (2018). Scalable Aspects Learning for Intent-Aware Diversified Search on Social Networks. IEEE Access, 6, pages 37124-37137. Online publication date: 2018.
https://doi.org/10.1109/ACCESS.2018.2850935

Ren P, Chen Z, Ma J, Wang S, Zhang Z, Ren Z, Ma T (2018). User session level diverse reranking of search results. Neurocomputing, 274:C, pages 66-79. Online publication date: 24-Jan-2018.
https://dl.acm.org/doi/10.1016/j.neucom.2016.05.087
