On statistical analysis and optimization of information retrieval effectiveness metrics

Published: 19 July 2010


This paper presents a new perspective on IR metric optimization. We argue that the optimal ranking problem should be factorized into two distinct yet interrelated stages: a relevance prediction stage and a ranking decision stage. During retrieval, the relevance of documents is not known a priori, so the joint probability of relevance is used to measure the uncertainty of documents' relevance in the collection as a whole. The optimization objective in the latter stage is thus the expected value of the IR metric with respect to this probability measure of relevance. By statistically analyzing the expected values of IR metrics under such uncertainty, we discover and explain some interesting properties of IR metrics that were not previously known. Our analysis and optimization framework does not assume a particular (relevance) retrieval model or metric, making it applicable to many existing IR models and metrics. Experiments on one of the resulting applications demonstrate its significance in adapting to various IR metrics.
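
The ranking decision stage described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes binary relevance, picks DCG as the metric, and takes the per-document relevance probabilities as given by some earlier prediction stage. The names `expected_dcg`, `rank_by_expected_dcg`, and `doc_probs` are hypothetical. Because DCG is a weighted sum of relevance values, its expected value depends only on the marginal probabilities; metrics that are not linear in relevance (average precision, for example) would need the joint distribution mentioned in the abstract.

```python
import math


def expected_dcg(probs_in_rank_order):
    """Expected DCG of a ranking under binary relevance.

    probs_in_rank_order[i] is the (assumed) probability that the document
    placed at rank i + 1 is relevant. By linearity of expectation,
    E[DCG] = sum_r P(rel at rank r) / log2(r + 1).
    """
    return sum(
        p / math.log2(rank + 1)
        for rank, p in enumerate(probs_in_rank_order, start=1)
    )


def rank_by_expected_dcg(doc_probs):
    """Return document ids ordered to maximize expected DCG.

    The rank discounts decrease with rank, so sorting by descending
    probability of relevance maximizes the weighted sum.
    """
    return sorted(doc_probs, key=doc_probs.get, reverse=True)


# Hypothetical output of a relevance prediction stage.
doc_probs = {"d1": 0.2, "d2": 0.9, "d3": 0.5}

ranking = rank_by_expected_dcg(doc_probs)
score = expected_dcg([doc_probs[d] for d in ranking])
print(ranking, round(score, 4))
```

Swapping in a different metric changes only the objective being maximized, while the predicted probabilities stay the same, which is the separation between the two stages that the abstract argues for.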


      SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
      July 2010
      944 pages
      Author Tags

      1. ir metrics
      2. learning to rank
      3. optimal ranking
      4. optimization
      5. ranking under uncertainty
      6. retrieval models


      Acceptance Rates

      SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;
      Overall Acceptance Rate 792 of 3,983 submissions, 20%


