ABSTRACT
Information retrieval (IR) evaluation scores are generally designed to measure the effectiveness with which relevant documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have primarily focused on aspects of precision and recall, and while the two are often discussed as being of equal importance, in practice most attention has been given to precision-focused metrics. Even for recall-oriented IR tasks of growing importance, such as patent retrieval, these precision-based scores remain the primary evaluation measures. Our study examines different evaluation measures for a recall-oriented patent retrieval task and demonstrates the limitations of the current scores in comparing different IR systems for this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of both recall and the user's search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of the performance of PRES shows its suitability for measuring the retrieval effectiveness of systems from a recall-focused perspective, taking into account the user's expected search effort.
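For concreteness, the sketch below shows how PRES can be computed, following the definition given in the full paper: PRES = 1 - ((sum of relevant ranks / n) - (n + 1)/2) / N_max, where n is the number of relevant documents for a topic, N_max is the maximum number of results the user is assumed to check, and any relevant document not retrieved within the top N_max is assumed, as the worst case, to be ranked immediately after N_max. The function and variable names are illustrative, not taken from the paper.

    def pres(retrieved_ranks, n_relevant, n_max):
        """Sketch of PRES: 1 - ((sum(r_i)/n) - (n + 1)/2) / N_max.

        retrieved_ranks: 1-based ranks (all <= n_max) at which relevant
            documents were found in the ranked result list.
        n_relevant: total number of relevant documents for the topic.
        n_max: maximum number of results the user is assumed to check.
        """
        # Worst-case assumption: relevant documents not retrieved within
        # the top n_max results are placed immediately after rank n_max.
        n_missed = n_relevant - len(retrieved_ranks)
        ranks = list(retrieved_ranks) + [n_max + i for i in range(1, n_missed + 1)]
        avg_rank = sum(ranks) / n_relevant
        return 1.0 - (avg_rank - (n_relevant + 1) / 2) / n_max

Under these assumptions, pres([1, 2, 3], 3, 10) = 1.0 (all relevant documents at the top of the ranking), pres([], 3, 10) = 0.0 (none retrieved), and pres([1, 5], 3, 10) ≈ 0.63 (two of three relevant documents found, at moderate search effort).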