poster

Wikipedia-based query performance prediction

Authors:
Gilad Katz

Ben-Gurion University, Beer-Sheva, Israel

Anna Shtock

Technion, Haifa, Israel

Oren Kurland

Technion, Haifa, Israel

Bracha Shapira

Ben-Gurion University, Beer-Sheva, Israel

Lior Rokach

Ben-Gurion University, Beer-Sheva, Israel


SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, July 2014, Pages 1235–1238, https://doi.org/10.1145/2600428.2609553

Published: 03 July 2014

ABSTRACT

The query-performance prediction task is to estimate retrieval effectiveness with no relevance judgments. Pre-retrieval prediction methods operate prior to retrieval time. Hence, these predictors are often based on analyzing the query and the corpus upon which retrieval is performed. We propose a corpus-independent approach to pre-retrieval prediction which relies on information extracted from Wikipedia. Specifically, we present Wikipedia-based features that can attest to the effectiveness of retrieval performed in response to a query regardless of the corpus upon which search is performed. Empirical evaluation demonstrates the merits of our approach. As a case in point, integrating the Wikipedia-based features with state-of-the-art pre-retrieval predictors that analyze the corpus yields prediction quality that is consistently better than that of using the latter alone.
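
The integration idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the Wikipedia-based features or how they are combined with corpus-based predictors. The sketch assumes a smoothed average-IDF score as the corpus-based pre-retrieval predictor, a hypothetical `wiki_title_match` feature (whether the full query equals a Wikipedia page title), and hand-set placeholder weights `w_corpus` and `w_wiki`.

```python
import math
from statistics import mean


def avg_idf(query_terms, doc_freq, num_docs):
    """Corpus-based pre-retrieval predictor: mean smoothed IDF of the query terms."""
    return mean(
        math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        for term in query_terms
    )


def wiki_title_match(query_terms, wiki_titles):
    """Hypothetical corpus-independent feature (an assumption, not from the paper):
    1.0 if the whole query string is a Wikipedia page title, else 0.0."""
    return 1.0 if " ".join(query_terms) in wiki_titles else 0.0


def combined_prediction(query_terms, doc_freq, num_docs, wiki_titles,
                        w_corpus=1.0, w_wiki=1.0):
    """Linear combination of both signals; the weights are placeholders."""
    return (w_corpus * avg_idf(query_terms, doc_freq, num_docs)
            + w_wiki * wiki_title_match(query_terms, wiki_titles))


def pearson(xs, ys):
    """Pearson correlation between two equal-length, non-constant sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Prediction quality is commonly reported as the correlation between the predicted values and the actual per-query retrieval effectiveness, which is what `pearson` would compute given both lists. In practice the combination weights would be learned from training queries rather than set by hand.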

Index Terms

Wikipedia-based query performance prediction
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published in
SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
July 2014
1330 pages
ISBN:9781450322577
DOI:10.1145/2600428
General Chairs:
Shlomo Geva, Queensland University of Technology
Andrew Trotman, University of Dunedin
Program Chairs:
Peter Bruza, Queensland University of Technology
Charles L.A. Clarke, University of Waterloo
Kal Järvelin, University of Tampere
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
query-performance prediction
wikipedia
Qualifiers
- poster
Acceptance Rates
SIGIR '14 Paper Acceptance Rate: 82 of 387 submissions, 21%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%