Score distribution models: assumptions, intuition, and robustness to score manipulation

Authors: Evangelos Kanoulas, Javed A. Aslam

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Pages 242 - 249

https://doi.org/10.1145/1835449.1835491

Published: 19 July 2010

Abstract

Inferring the score distribution of relevant and non-relevant documents is an essential task for many IR applications (e.g. information filtering, recall-oriented IR, meta-search, distributed IR). Modeling score distributions in an accurate manner is the basis of any inference. Thus, numerous score distribution models have been proposed in the literature. Most of the models were proposed on the basis of empirical evidence and goodness-of-fit. In this work, we model score distributions in a rather different, systematic manner. We start with a basic assumption on the distribution of terms in a document. Following the transformations applied on term frequencies by two basic ranking functions, BM25 and Language Models, we derive the distribution of the produced scores for all documents. Then we focus on the relevant documents. We detach our analysis from particular ranking functions. Instead, we consider a model for precision-recall curves, and given this model, we present a general mathematical framework which, given any score distribution for all retrieved documents, produces an analytical formula for the score distribution of relevant documents that is consistent with the precision-recall curves that follow the aforementioned model. In particular, assuming a Gamma distribution for all retrieved documents, we show that the derived distribution for the relevant documents resembles a Gaussian distribution with a heavy right tail.
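
As a minimal illustration of the Gamma assumption mentioned in the abstract, the sketch below fits a Gamma distribution to a list of retrieval scores using the method of moments. This is not the derivation from the paper: the estimator, the function name `fit_gamma_moments`, and the synthetic scores are assumptions chosen only to make the idea concrete.

```python
import random
import statistics


def fit_gamma_moments(scores):
    """Method-of-moments estimate of Gamma (shape, scale) for positive scores.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so shape = mean**2 / variance
    and scale = variance / mean.
    """
    mean = statistics.fmean(scores)
    var = statistics.variance(scores)  # sample variance
    if mean <= 0 or var <= 0:
        raise ValueError("scores must have positive mean and non-zero variance")
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


# Synthetic stand-in for the scores of all retrieved documents (assumption):
# drawn from a Gamma distribution with shape 2.0 and scale 1.5.
rng = random.Random(0)
synthetic_scores = [rng.gammavariate(2.0, 1.5) for _ in range(20000)]

shape, scale = fit_gamma_moments(synthetic_scores)
print(f"estimated shape={shape:.3f}, scale={scale:.3f}")
```

The method of moments is used here only for brevity; a maximum-likelihood fit would be a reasonable alternative when more accurate parameter estimates are needed.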

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

July 2010

944 pages

ISBN:9781450301534

DOI:10.1145/1835449

General Chairs: Fabio Crestani (University of Lugano, CH), Stéphane Marchand-Maillet (University of Geneva, CH)

Program Chairs: Hsin-Hsi Chen (National Taiwan University, TW), Efthimis N. Efthimiadis (University of Washington, USA), Jacques Savoy (University of Neuchatel, CH)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '10: The 33rd International ACM SIGIR conference on research and development in Information Retrieval

July 19 - 23, 2010

Geneva, Switzerland

Acceptance Rates

SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
