DOI: 10.1145/1390334.1390435
research-article

A new rank correlation coefficient for information retrieval

Published: 20 July 2008

Abstract

In information retrieval, one often needs to compute the correlation between two ranked lists. The most commonly used statistic for quantifying this correlation is Kendall's τ. In the information retrieval community, discrepancies among highly ranked items are often more important than those among items ranked low. Kendall's τ, however, makes no such distinction and penalizes errors at high and low rankings equally.
In this paper, we propose a new rank correlation coefficient, AP correlation (τ_ap), that is based on average precision and has a probabilistic interpretation. We show that the proposed statistic gives more weight to errors at high rankings and has mathematical properties that make it easy to interpret. We further validate the applicability of the statistic using experimental data.
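
As a rough illustration of the contrast drawn in the abstract, the sketch below computes Kendall's τ and an AP-style correlation for two rankings of the same items without ties. The abstract does not state the formula; the code assumes the definition commonly given for τ_ap, namely 2/(N-1) times the sum over ranks i = 2..N of C(i)/(i-1), minus 1, where C(i) counts the items placed above rank i that the reference ranking also places above the item at rank i. The function names and the toy lists are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(ranking, reference):
    """Kendall's tau for two rankings of the same items (no ties)."""
    n = len(ranking)
    if n < 2:
        raise ValueError("need at least two items")
    ref_pos = {item: r for r, item in enumerate(reference)}
    # combinations() yields (a, b) with a ranked above b in `ranking`;
    # the pair is concordant if the reference agrees with that order.
    concordant = sum(
        1 for a, b in combinations(ranking, 2) if ref_pos[a] < ref_pos[b]
    )
    total = n * (n - 1) // 2
    return (2 * concordant - total) / total


def tau_ap(ranking, reference):
    """AP-style correlation of `ranking` against `reference` (assumed formula).

    Asymmetric: `reference` plays the role of the true ranking.
    """
    n = len(ranking)
    if n < 2:
        raise ValueError("need at least two items")
    ref_pos = {item: r for r, item in enumerate(reference)}
    total = 0.0
    for i in range(1, n):  # 0-based index i corresponds to rank i + 1
        item = ranking[i]
        # Items above `item` that the reference also places above it.
        correct = sum(1 for above in ranking[:i] if ref_pos[above] < ref_pos[item])
        total += correct / i
    return 2.0 * total / (n - 1) - 1.0


if __name__ == "__main__":
    reference = ["a", "b", "c", "d", "e"]
    swap_top = ["b", "a", "c", "d", "e"]     # error at the top of the list
    swap_bottom = ["a", "b", "c", "e", "d"]  # error at the bottom of the list
    for name, r in [("top swap", swap_top), ("bottom swap", swap_bottom)]:
        print(name, kendall_tau(r, reference), tau_ap(r, reference))
```

Under these assumptions, both toy rankings differ from the reference by a single adjacent swap, so Kendall's τ is 0.8 for each. The AP-style value is 0.5 for the swap at the top and 0.875 for the swap at the bottom, which matches the abstract's claim that errors at high rankings are penalized more.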

    Published In

    SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
    July 2008
    934 pages
    ISBN:9781605581644
    DOI:10.1145/1390334
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Kendall's tau
    2. average precision
    3. evaluation
    4. rank correlation

    Qualifiers

    • Research-article

    Conference

    SIGIR '08

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2024) How do Ties Affect the Uncertainty in Rank-Biased Overlap? Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 125-134. DOI: 10.1145/3673791.3698422. Online publication date: 8-Dec-2024
    • (2024) Rank-Biased Quality Measurement for Sets and Rankings. Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 135-144. DOI: 10.1145/3673791.3698405. Online publication date: 8-Dec-2024
    • (2024) The Treatment of Ties in Rank-Biased Overlap. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 251-260. DOI: 10.1145/3626772.3657700. Online publication date: 10-Jul-2024
    • (2024) Towards Robustness Analysis of E-Commerce Ranking System. Companion Proceedings of the ACM Web Conference 2024, pages 364-373. DOI: 10.1145/3589335.3648335. Online publication date: 13-May-2024
    • (2024) Limit theorems for a class of processes generalizing the U-empirical process. Stochastics, 96:1, pages 799-845. DOI: 10.1080/17442508.2024.2320402. Online publication date: 12-Mar-2024
    • (2024) Rank Correlation. Correlation in Engineering and the Applied Sciences, pages 77-106. DOI: 10.1007/978-3-031-51015-1_3. Online publication date: 8-Mar-2024
    • (2023) Evaluating neuron interpretation methods of NLP models. Proceedings of the 37th International Conference on Neural Information Processing Systems, pages 75644-75668. DOI: 10.5555/3666122.3669428. Online publication date: 10-Dec-2023
    • (2023) Concentration inequality for U-statistics of order two for uniformly ergodic Markov chains. Bernoulli, 29:2. DOI: 10.3150/22-BEJ1485. Online publication date: 1-May-2023
    • (2023) Selecting which Dense Retriever to use for Zero-Shot Search. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 223-233. DOI: 10.1145/3624918.3625330. Online publication date: 26-Nov-2023
    • (2023) How Discriminative Are Your Qrels? How To Study the Statistical Significance of Document Adjudication Methods. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pages 1960-1970. DOI: 10.1145/3583780.3614916. Online publication date: 21-Oct-2023
