Research article · DOI: 10.1145/1835449.1835475

Learning to efficiently rank

Published: 19 July 2010

Abstract

It has been shown that learning to rank approaches are capable of learning highly effective ranking functions. However, these approaches have mostly ignored the important issue of efficiency. Given that both efficiency and effectiveness are important for real search engines, models that are optimized for effectiveness may not meet the strict efficiency requirements necessary to deploy in a production environment. In this work, we present a unified framework for jointly optimizing effectiveness and efficiency. We propose new metrics that capture the tradeoff between these two competing forces and devise a strategy for automatically learning models that directly optimize the tradeoff metrics. Experiments indicate that models learned in this way provide a good balance between retrieval effectiveness and efficiency. With specific loss functions, learned models converge to familiar existing ones, which demonstrates the generality of our framework. Finally, we show that our approach naturally leads to a reduction in the variance of query execution times, which is important for query load balancing and user satisfaction.
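The abstract describes learning models that directly optimize a metric capturing the tradeoff between effectiveness and efficiency. As a minimal sketch of that idea (not the paper's actual formulation), the snippet below scores candidate ranking configurations with a hypothetical linear tradeoff, effectiveness minus a cost penalty; the function name `tradeoff`, the penalty weight `alpha`, and all numbers are illustrative assumptions.

```python
# Hypothetical illustration of a joint effectiveness/efficiency objective.
# `tradeoff`, `alpha`, and the toy numbers are assumptions, not the paper's.

def tradeoff(effectiveness: float, cost: float, alpha: float = 0.1) -> float:
    """Higher is better: reward effectiveness, penalize query execution cost."""
    return effectiveness - alpha * cost

# Toy candidate ranking models: (effectiveness score, relative query cost).
candidates = {
    "term-only":      (0.30, 0.2),  # cheap but less effective
    "term+proximity": (0.36, 0.7),  # more effective but expensive
    "learned-subset": (0.34, 0.4),  # a middle ground
}

# Select the model that maximizes the tradeoff metric.
best = max(candidates, key=lambda m: tradeoff(*candidates[m]))
print(best)  # → learned-subset
```

With a small cost penalty, the middle-ground model wins even though it is neither the cheapest nor the most effective, which is the kind of balance the abstract argues for; tuning `alpha` toward zero recovers a purely effectiveness-driven choice.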




    Published In

    SIGIR '10: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2010, 944 pages
    ISBN: 9781450301534
    DOI: 10.1145/1835449

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. effectiveness and efficiency tradeoff
    2. learning to rank
    3. linear models


    Acceptance Rates

    SIGIR '10 paper acceptance rate: 87 of 520 submissions, 17%
    Overall acceptance rate: 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 20
    • Downloads (last 6 weeks): 2

    Reflects downloads up to 20 Feb 2025

    Cited By

    • (2024) Towards Effective and Efficient Sparse Neural Information Retrieval. ACM Transactions on Information Systems, 42(5), 1-46. DOI: 10.1145/3634912. Published 29 Apr 2024.
    • (2023) Efficient Document-at-a-time and Score-at-a-time Query Evaluation for Learned Sparse Representations. ACM Transactions on Information Systems, 41(4), 1-28. DOI: 10.1145/3576922. Published 22 Mar 2023.
    • (2023) Early Exit Strategies for Learning-to-Rank Cascades. IEEE Access, 11, 126691-126704. DOI: 10.1109/ACCESS.2023.3331088. Published 2023.
    • (2022) H-ERNIE. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1478-1489. DOI: 10.1145/3477495.3531986. Published 6 Jul 2022.
    • (2022) Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs. Advances in Information Retrieval, 260-273. DOI: 10.1007/978-3-030-99736-6_18. Published 5 Apr 2022.
    • (2022) Scalability Challenges in Web Search Engines. Published 10 Mar 2022.
    • (2022) Learning to Rank for Information Retrieval and Natural Language Processing. Published 2 Apr 2022.
    • (2021) Improving search engine efficiency through contextual factor selection. AI Magazine, 42(2), 50-58. DOI: 10.1609/aimag.v42i2.15099. Published 1 Jun 2021.
    • (2021) Learning Early Exit Strategies for Additive Ranking Ensembles. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2217-2221. DOI: 10.1145/3404835.3463088. Published 11 Jul 2021.
    • (2020) Query-level Early Exit for Additive Learning-to-Rank Ensembles. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2033-2036. DOI: 10.1145/3397271.3401256. Published 25 Jul 2020.
