DOI:10.1145/1772690.1772770
research-article

Diversifying web search results

Published: 26 April 2010

Abstract

Result diversity is a topic of great importance as more facets of queries are discovered and users expect to find their desired facets on the first page of the results. However, the underlying questions of how 'diversity' interplays with 'quality' and when preference should be given to one or both are not well understood. In this work, we model the problem as expectation maximization and study the challenges of estimating the model parameters and reaching an equilibrium. One model parameter, for example, is the correlation between pages, which we estimate using the textual content of pages and click data (when available). We conduct experiments on diversifying randomly selected queries from a query log and queries chosen from the disambiguation topics of Wikipedia. Our algorithm improves upon Google in terms of the diversity of random queries, retrieving 14% to 38% more aspects of queries in the top 5, while maintaining a precision very close to Google's. On a more selective set of queries that are expected to benefit from diversification, our algorithm improves upon Google in terms of both precision and diversity of the results, and significantly outperforms another baseline system for result diversification.



    Published In

    WWW '10: Proceedings of the 19th international conference on World wide web
    April 2010
    1407 pages
    ISBN:9781605587998
    DOI:10.1145/1772690

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. query ambiguity
    2. result diversity
    3. search diversity

    Qualifiers

    • Research-article

    Conference

    WWW '10: The 19th International World Wide Web Conference
    April 26 - 30, 2010
    Raleigh, North Carolina, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
