DOI: 10.1145/1835449.1835551
Research article

PRES: a score metric for evaluating recall-oriented information retrieval applications

Published: 19 July 2010

ABSTRACT

Information retrieval (IR) evaluation scores are generally designed to measure the effectiveness with which relevant documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have primarily focused on aspects of precision and recall, and while the two are often discussed as equally important, in practice most attention has been given to precision-focused metrics. Even for recall-oriented IR tasks of growing importance, such as patent retrieval, these precision-based scores remain the primary evaluation measures. Our study examines different evaluation measures for a recall-oriented patent retrieval task and demonstrates the limitations of the current scores in comparing different IR systems for this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of recall and the user's search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of the performance of PRES shows its suitability for measuring the retrieval effectiveness of systems from a recall-focused perspective, taking into account the user's expected search effort.
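
For concreteness, the Python sketch below shows how a PRES-style score can be computed from the ranks at which relevant documents are retrieved and a maximum expected search effort N_max. It follows the paper's description of PRES as a normalized-recall-style measure in which relevant documents not retrieved within the top N_max results are treated as if they appeared immediately after rank N_max; the function and parameter names and the example values are illustrative assumptions, not taken from the paper.

```python
def pres(retrieved_ranks, num_relevant, n_max):
    """Sketch of a PRES-style score.

    retrieved_ranks: 1-based ranks (within the result list) at which
                     relevant documents were retrieved.
    num_relevant:    total number of relevant documents for the topic.
    n_max:           maximum number of results the user is expected to check.
    """
    # Keep only relevant documents found within the user's search effort.
    found = sorted(r for r in retrieved_ranks if r <= n_max)

    # Assumption (per the paper's description): relevant documents not
    # retrieved within n_max are treated as if they appeared immediately
    # after rank n_max.
    missing = num_relevant - len(found)
    assumed_ranks = found + [n_max + i for i in range(1, missing + 1)]

    n = num_relevant
    avg_rank = sum(assumed_ranks) / n
    # PRES = 1 - (mean rank of relevant docs - best possible mean rank) / n_max
    return 1.0 - (avg_rank - (n + 1) / 2.0) / n_max


if __name__ == "__main__":
    # Hypothetical example: 2 of 3 relevant documents found at ranks 1 and 4,
    # with the user willing to check up to 10 results.
    print(round(pres([1, 4], num_relevant=3, n_max=10), 3))
```

Under this convention the score is 1 when all relevant documents occupy the top ranks, and it falls to 0 when no relevant document appears within the first N_max results.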

Published in

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
July 2010, 944 pages
ISBN: 9781450301534
DOI: 10.1145/1835449
Copyright © 2010 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

      Acceptance Rates

SIGIR '10 paper acceptance rate: 87 of 520 submissions, 17%. Overall acceptance rate: 792 of 3,983 submissions, 20%.
