research-article

Exploring reductions for long web queries

Authors:

Niranjan Balasubramanian,

Giridhar Kumaran,

Vitor R. Carvalho

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Pages 571 - 578

https://doi.org/10.1145/1835449.1835545

Published: 19 July 2010

Abstract

Long queries form a difficult but increasingly important segment for web search engines. Query reduction, a technique for dropping unnecessary terms from long queries, improves the performance of ad-hoc retrieval on TREC collections. It also has great potential for improving long web queries (up to 25% improvement in NDCG@5). However, query reduction on the web is hampered by the lack of accurate query performance predictors and by the constraints imposed by search engine architectures and ranking algorithms.
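For readers unfamiliar with the metric, NDCG@5 can be computed roughly as follows. This is a minimal sketch assuming the exponential-gain form with a log2 rank discount, which is one common formulation; the abstract does not say which variant the authors used. The function names and the graded relevance labels are illustrative, and the ideal ranking is taken over the judged results passed in.

```python
import math

def dcg_at_k(gains, k=5):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum((2 ** g - 1) / math.log2(rank + 2)
               for rank, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k=5):
    """DCG normalized by the DCG of the same labels sorted best-first."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

# Hypothetical labels for the top results of an original and a reduced query.
original = [0, 1, 0, 2, 0]
reduced = [2, 1, 0, 0, 0]
print(ndcg_at_k(original), ndcg_at_k(reduced))
```

A reduced query helps under this metric when it moves highly relevant results toward the top five positions.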

In this paper, we present query reduction techniques for long web queries that leverage effective and efficient query performance predictors. We propose three learning formulations that combine these predictors to perform automatic query reduction. These formulations enable trading average improvement for the number of queries impacted, and they allow easy integration into the search engine's architecture for rank-time query reduction. Experiments on a large collection of long queries issued to a commercial search engine show that the proposed techniques significantly outperform baselines, with more than 12% improvement in NDCG@5 in the impacted set of queries. Extensions to the formulations, such as result interleaving, further improve results. We find that the proposed techniques deliver consistent retrieval gains where it matters most: poorly performing long web queries.
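The trade-off between average improvement and the number of queries impacted can be illustrated with a thresholded selection rule. The sketch below is not the authors' formulation. It assumes that candidate reductions are formed by dropping a single term, that a predictor returns a quality score for any query, and that a reduction is used only when its predicted gain over the original exceeds a threshold. `toy_predict`, `NOISE`, and the threshold values are hypothetical stand-ins for a learned model.

```python
from itertools import combinations

def candidate_reductions(terms, max_drop=1):
    """Return the sub-queries obtained by dropping up to max_drop terms."""
    candidates = []
    for d in range(1, max_drop + 1):
        for dropped in combinations(range(len(terms)), d):
            kept = [t for i, t in enumerate(terms) if i not in dropped]
            if kept:  # never emit an empty query
                candidates.append(kept)
    return candidates

def choose_query(terms, predict, threshold=0.0, max_drop=1):
    """Pick the best-scoring reduction, but only if its predicted gain
    over the original query exceeds the threshold."""
    base = predict(terms)
    best, best_score = terms, base
    for cand in candidate_reductions(terms, max_drop):
        score = predict(cand)
        if score > best_score:
            best, best_score = cand, score
    return best if best_score - base > threshold else terms

# Hypothetical predictor: the fraction of terms that are not filler words.
NOISE = {"how", "to"}

def toy_predict(terms):
    return sum(t not in NOISE for t in terms) / len(terms)

query = ["how", "to", "fix", "python", "import", "error"]
print(choose_query(query, toy_predict, threshold=0.1))  # a reduced query
print(choose_query(query, toy_predict, threshold=0.5))  # original kept
```

Raising the threshold leaves more queries untouched, so only the reductions with the largest predicted gains are applied; lowering it affects more queries at the cost of smaller and less reliable gains.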

Index Terms

Exploring reductions for long web queries
1. Information systems
  1. Information retrieval

Published In

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

July 2010

944 pages

ISBN:9781450301534

DOI:10.1145/1835449

General Chairs: Fabio Crestani (University of Lugano, CH), Stéphane Marchand-Maillet (University of Geneva, CH)

Program Chairs: Hsin-Hsi Chen (National Taiwan University, TW), Efthimis N. Efthimiadis (University of Washington, USA), Jacques Savoy (University of Neuchatel, CH)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '10

Sponsor:

SIGIR

SIGIR '10: The 33rd International ACM SIGIR conference on research and development in Information Retrieval

July 19 - 23, 2010

Geneva, Switzerland

Acceptance Rates

SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
