
Estimating the query difficulty for information retrieval

Published: 19 July 2010

Abstract

Many information retrieval (IR) systems suffer from radical variance in performance when responding to users' queries. Even for systems that succeed very well on average, the quality of the results returned for some queries is poor. It is therefore desirable that IR systems be able to identify "difficult" queries and handle them properly. Understanding why some queries are inherently more difficult than others is essential for IR, and a good answer to this important question will help search engines reduce the variance in performance and thus better serve their users' needs.
The high variability in query performance has driven a new research direction in the IR field: estimating the expected quality of the search results, i.e., the query difficulty, when no relevance feedback is given. Estimating query difficulty is a significant challenge due to the numerous factors that impact retrieval performance. Many prediction methods have been proposed in recent years; however, as many researchers have observed, the prediction quality of state-of-the-art predictors is still too low for wide use in IR applications. The low prediction quality is due to the complexity of the task, which involves factors such as query ambiguity, missing content, and vocabulary mismatch.
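To make the prediction task concrete, below is a minimal sketch of one classic pre-retrieval predictor: it scores a query by the average inverse document frequency (IDF) of its terms, on the intuition that queries built from specific, discriminative terms tend to retrieve better results than queries built from common, ambiguous ones. This is not a method prescribed by the tutorial; the function name, the toy corpus, and the query are all illustrative.

    import math
    from collections import Counter

    def avg_idf(query_terms, doc_freqs, num_docs):
        """Average smoothed IDF of the query terms: a simple
        pre-retrieval difficulty predictor (higher scores suggest
        a more specific, and often easier, query)."""
        if not query_terms:
            return 0.0
        return sum(
            math.log((num_docs + 1) / (doc_freqs.get(term, 0) + 1))
            for term in query_terms
        ) / len(query_terms)

    # Toy document collection (illustrative only).
    docs = [
        "estimating query difficulty for information retrieval",
        "search engines respond to user queries",
        "query performance prediction without relevance feedback",
    ]
    doc_freqs = Counter(term for doc in docs for term in set(doc.split()))

    print(avg_idf("query difficulty".split(), doc_freqs, len(docs)))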
The goal of this tutorial is to expose participants to current research on query performance prediction (also known as query difficulty estimation). Participants will become familiar with state-of-the-art performance prediction methods and with common evaluation methodologies for prediction quality. We will discuss the reasons that cause search engines to fail on some queries, and provide an overview of several approaches for estimating query difficulty. We then describe common methodologies for evaluating the prediction quality of those estimators, together with recent experiments measuring their prediction quality over several TREC benchmarks. We will cover a few potential applications that can utilize query difficulty estimators by handling each query individually and selectively, based on its estimated difficulty. Finally, we will conclude with a discussion of open issues and challenges in the field.
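As for the evaluation methodology mentioned above: the standard practice in this literature is to correlate per-query predictor scores with the actual retrieval effectiveness (e.g., average precision) measured for the same queries on a test collection such as a TREC benchmark. A small sketch of that computation, assuming SciPy is available; the numbers are made up for illustration, not real TREC measurements:

    from scipy.stats import kendalltau, pearsonr

    # Hypothetical per-query data (not real TREC measurements):
    # each predictor score is paired with the average precision (AP)
    # measured for the same query on a test collection.
    predicted_scores = [0.82, 0.31, 0.57, 0.12, 0.44]
    actual_ap = [0.65, 0.20, 0.50, 0.05, 0.35]

    # Rank correlation (Kendall's tau) is the most commonly reported
    # figure; Pearson's r measures linear correlation of raw values.
    tau, _ = kendalltau(predicted_scores, actual_ap)
    r, _ = pearsonr(predicted_scores, actual_ap)
    print(f"Kendall tau = {tau:.3f}, Pearson r = {r:.3f}")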



    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN:9781450301534
    DOI:10.1145/1835449
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 19 July 2010


    Author Tags

    1. performance prediction
    2. query difficulty estimation
    3. retrieval robustness

    Qualifiers

    • Tutorial

    Conference

    SIGIR '10

    Acceptance Rates

SIGIR '10 Paper Acceptance Rate: 87 of 520 submissions (17%)
Overall Acceptance Rate: 792 of 3,983 submissions (20%)


    Article Metrics

• Downloads (last 12 months): 16
• Downloads (last 6 weeks): 4
    Reflects downloads up to 09 Feb 2025

Cited By
• (2023) Sentiment Difficulty in Aspect-Based Sentiment Analysis. Mathematics 11(22):4647. DOI: 10.3390/math11224647. Online publication date: 14-Nov-2023.
• (2023) MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval. Advances in Information Retrieval, 636-652. DOI: 10.1007/978-3-031-28244-7_40. Online publication date: 2-Apr-2023.
• (2022) A Relative Information Gain-based Query Performance Prediction Framework with Generated Query Variants. ACM Transactions on Information Systems 41(2):1-31. DOI: 10.1145/3545112. Online publication date: 21-Dec-2022.
• (2022) From Easy to Hard. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2784-2794. DOI: 10.1145/3511808.3557328. Online publication date: 17-Oct-2022.
• (2022) A semantic approach to post-retrieval query performance prediction. Information Processing and Management 59(1). DOI: 10.1016/j.ipm.2021.102746. Online publication date: 1-Jan-2022.
• (2022) An Analysis of Variations in the Effectiveness of Query Performance Prediction. Advances in Information Retrieval, 215-229. DOI: 10.1007/978-3-030-99736-6_15. Online publication date: 5-Apr-2022.
• (2021) Effective Query Formulation in Conversation Contextualization: A Query Specificity-based Approach. Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, 177-183. DOI: 10.1145/3471158.3472237. Online publication date: 11-Jul-2021.
• (2021) Towards Meaningful Statements in IR Evaluation: Mapping Evaluation Measures to Interval Scales. IEEE Access 9:136182-136216. DOI: 10.1109/ACCESS.2021.3116857. Online publication date: 2021.
• (2020) Feature Extraction for Large-Scale Text Collections. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 3015-3022. DOI: 10.1145/3340531.3412773. Online publication date: 19-Oct-2020.
• (2020) Supervised approaches for explicit search result diversification. Information Processing & Management 57(6):102356. DOI: 10.1016/j.ipm.2020.102356. Online publication date: Nov-2020.
