
Estimating the query difficulty for information retrieval

Published: 19 July 2010

Abstract

Many information retrieval (IR) systems suffer from radical variance in performance when responding to users' queries. Even for systems that succeed very well on average, the quality of the results returned for some queries is poor. It is therefore desirable that IR systems be able to identify "difficult" queries and handle them properly. Understanding why some queries are inherently more difficult than others is essential for IR, and a good answer to this important question will help search engines reduce the variance in performance and thus better serve their users' needs.
The high variability in query performance has driven a new research direction in the IR field: estimating the expected quality of the search results, i.e., the query difficulty, when no relevance feedback is given. Estimating query difficulty is a significant challenge due to the numerous factors that impact retrieval performance. Many prediction methods have been proposed in recent years; however, as many researchers have observed, the prediction quality of state-of-the-art predictors is still too low for wide use in IR applications. The low prediction quality is due to the complexity of the task, which involves factors such as query ambiguity, missing content, and vocabulary mismatch.
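To make the prediction task concrete, below is a minimal sketch of one classic pre-retrieval predictor: it scores a query by the average inverse document frequency (IDF) of its terms, on the intuition that queries built from specific, discriminative terms tend to retrieve better results than queries built from common, ambiguous ones. This is not a method prescribed by the tutorial; the function name, the toy corpus, and the query are all illustrative.

    import math
    from collections import Counter

    def avg_idf(query_terms, doc_freqs, num_docs):
        """Average smoothed IDF of the query terms: a simple
        pre-retrieval difficulty predictor (higher scores suggest
        a more specific, and often easier, query)."""
        if not query_terms:
            return 0.0
        return sum(
            math.log((num_docs + 1) / (doc_freqs.get(term, 0) + 1))
            for term in query_terms
        ) / len(query_terms)

    # Toy document collection (illustrative only).
    docs = [
        "estimating query difficulty for information retrieval",
        "search engines respond to user queries",
        "query performance prediction without relevance feedback",
    ]
    doc_freqs = Counter(term for doc in docs for term in set(doc.split()))

    print(avg_idf("query difficulty".split(), doc_freqs, len(docs)))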
The goal of this tutorial is to expose participants to current research on query performance prediction (also known as query difficulty estimation). Participants will become familiar with state-of-the-art performance prediction methods and with common evaluation methodologies for prediction quality. We will discuss the reasons that cause search engines to fail on some queries, and provide an overview of several approaches for estimating query difficulty. We then describe common methodologies for evaluating the prediction quality of those estimators, together with recent experiments measuring their prediction quality over several TREC benchmarks. We will cover a few potential applications that can utilize query difficulty estimators by handling each query individually and selectively, based on its estimated difficulty. Finally, we will conclude with a discussion of open issues and challenges in the field.
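As for the evaluation methodology mentioned above: the standard practice in this literature is to correlate per-query predictor scores with the actual retrieval effectiveness (e.g., average precision) measured for the same queries on a test collection such as a TREC benchmark. A small sketch of that computation, assuming SciPy is available; the numbers are made up for illustration, not real TREC measurements:

    from scipy.stats import kendalltau, pearsonr

    # Hypothetical per-query data (not real TREC measurements):
    # each predictor score is paired with the average precision (AP)
    # measured for the same query on a test collection.
    predicted_scores = [0.82, 0.31, 0.57, 0.12, 0.44]
    actual_ap = [0.65, 0.20, 0.50, 0.05, 0.35]

    # Rank correlation (Kendall's tau) is the most commonly reported
    # figure; Pearson's r measures linear correlation of raw values.
    tau, _ = kendalltau(predicted_scores, actual_ap)
    r, _ = pearsonr(predicted_scores, actual_ap)
    print(f"Kendall tau = {tau:.3f}, Pearson r = {r:.3f}")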



    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN:9781450301534
    DOI:10.1145/1835449
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 19 July 2010


    Author Tags

    1. performance prediction
    2. query difficulty estimation
    3. retrieval robustness

    Qualifiers

    • Tutorial

    Conference

    SIGIR '10

    Acceptance Rates

SIGIR '10 Paper Acceptance Rate: 87 of 520 submissions (17%)
Overall Acceptance Rate: 792 of 3,983 submissions (20%)


    Article Metrics

• Downloads (last 12 months): 16
• Downloads (last 6 weeks): 4
    Reflects downloads up to 09 Feb 2025

Cited By
• (2023) Sentiment Difficulty in Aspect-Based Sentiment Analysis. Mathematics 11(22):4647. DOI: 10.3390/math11224647. Online publication date: 14-Nov-2023.
• (2023) MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval. Advances in Information Retrieval, 636-652. DOI: 10.1007/978-3-031-28244-7_40. Online publication date: 2-Apr-2023.
• (2022) A Relative Information Gain-based Query Performance Prediction Framework with Generated Query Variants. ACM Transactions on Information Systems 41(2):1-31. DOI: 10.1145/3545112. Online publication date: 21-Dec-2022.
• (2022) From Easy to Hard. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2784-2794. DOI: 10.1145/3511808.3557328. Online publication date: 17-Oct-2022.
• (2022) A semantic approach to post-retrieval query performance prediction. Information Processing and Management 59(1). DOI: 10.1016/j.ipm.2021.102746. Online publication date: 1-Jan-2022.
• (2022) An Analysis of Variations in the Effectiveness of Query Performance Prediction. Advances in Information Retrieval, 215-229. DOI: 10.1007/978-3-030-99736-6_15. Online publication date: 5-Apr-2022.
• (2021) Effective Query Formulation in Conversation Contextualization: A Query Specificity-based Approach. Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, 177-183. DOI: 10.1145/3471158.3472237. Online publication date: 11-Jul-2021.
• (2021) Towards Meaningful Statements in IR Evaluation: Mapping Evaluation Measures to Interval Scales. IEEE Access 9:136182-136216. DOI: 10.1109/ACCESS.2021.3116857. Online publication date: 2021.
• (2020) Feature Extraction for Large-Scale Text Collections. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 3015-3022. DOI: 10.1145/3340531.3412773. Online publication date: 19-Oct-2020.
• (2020) Supervised approaches for explicit search result diversification. Information Processing & Management 57(6):102356. DOI: 10.1016/j.ipm.2020.102356. Online publication date: Nov-2020.
