A new rank correlation coefficient for information retrieval

Authors:

Javed A. Aslam,

Stephen Robertson

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Pages 587 - 594

https://doi.org/10.1145/1390334.1390435

Published: 20 July 2008

Abstract

In information retrieval, one often needs to compute the correlation between two ranked lists. The statistic most commonly used to quantify this correlation is Kendall's τ. In the information retrieval community, discrepancies among items with high rankings are often more important than those among items with low rankings. Kendall's τ, however, makes no such distinction and penalizes errors at high and low rankings equally.

In this paper, we propose a new rank correlation coefficient, AP correlation (τ_ap), that is based on average precision and has a probabilistic interpretation. We show that the proposed statistic gives more weight to errors at high rankings and has mathematical properties that make it easy to interpret. We further validate the applicability of the statistic using experimental data.
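To make the contrast in the abstract concrete, here is a minimal Python sketch comparing Kendall's τ with AP correlation on two rankings that each contain a single adjacent swap, one at the top and one at the bottom. The abstract does not state the formula; the sketch assumes the commonly cited definition τ_ap = 2/(N-1) · Σ_{i=2..N} C(i)/(i-1) - 1, where C(i) counts the items ranked above position i in the candidate list that the reference list also ranks above that item. The function names and the five-item example are illustrative, and ties are not handled.

```python
from itertools import combinations


def kendall_tau(reference, candidate):
    """Kendall's tau between two tie-free rankings of the same items."""
    n = len(candidate)
    if n < 2:
        raise ValueError("need at least two items")
    ref_pos = {item: rank for rank, item in enumerate(reference)}
    total_pairs = n * (n - 1) // 2
    # A pair is concordant when both lists order it the same way.
    concordant = sum(
        1 for a, b in combinations(candidate, 2) if ref_pos[a] < ref_pos[b]
    )
    discordant = total_pairs - concordant
    return (concordant - discordant) / total_pairs


def tau_ap(reference, candidate):
    """AP correlation of `candidate` against `reference` (assumed formula).

    Not symmetric: `reference` is treated as the ground-truth ranking.
    """
    n = len(candidate)
    if n < 2:
        raise ValueError("need at least two items")
    ref_pos = {item: rank for rank, item in enumerate(reference)}
    total = 0.0
    for i in range(1, n):  # 0-based index i corresponds to rank i + 1
        item = candidate[i]
        # C(i): items above this one that the reference also places above it.
        correct = sum(
            1 for above in candidate[:i] if ref_pos[above] < ref_pos[item]
        )
        total += correct / i
    return 2.0 * total / (n - 1) - 1.0


reference = ["a", "b", "c", "d", "e"]
swap_top = ["b", "a", "c", "d", "e"]      # error at the highest ranks
swap_bottom = ["a", "b", "c", "e", "d"]   # error at the lowest ranks

print(kendall_tau(reference, swap_top), kendall_tau(reference, swap_bottom))
print(tau_ap(reference, swap_top), tau_ap(reference, swap_bottom))
```

On this example Kendall's τ is 0.8 for both swaps, while τ_ap is 0.5 for the swap at the top and 0.875 for the swap at the bottom, which matches the abstract's claim that errors at high rankings receive more weight.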

Index Terms

A new rank correlation coefficient for information retrieval
1. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results

Published In

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

July 2008

934 pages

ISBN:9781605581644

DOI:10.1145/1390334

General Chairs:
Tat-Seng Chua (National University of Singapore)
Mun-Kew Leong (National Library Board, Singapore)

Program Chairs:
Syung Hyon Myaeng (Information and Communications University, Korea)
Douglas W. Oard (University of Maryland, College Park, USA)
Fabrizio Sebastiani (Consiglio Nazionale delle Ricerche, Italy)

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGIR '08

SIGIR '08: The 31st Annual International ACM SIGIR Conference

July 20 - 24, 2008

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
