DOI: 10.1145/1143844.1143874
Article

The relationship between Precision-Recall and ROC curves

Published: 25 June 2006

ABSTRACT

Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we also note that differences between the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.
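The interpolation claim can be illustrated with a small sketch. It is not the paper's algorithm for the achievable PR curve. It only assumes the standard definitions (recall = TP / positives, precision = TP / (TP + FP), false positive rate = FP / negatives) and the fact that a straight line in ROC space corresponds to true and false positive counts changing linearly together. The function names and the example counts are hypothetical, chosen only for illustration.

```python
def roc_point(tp, fp, pos, neg):
    """Return (false positive rate, true positive rate) for given counts."""
    return fp / neg, tp / pos


def pr_point(tp, fp, pos):
    """Return (recall, precision) for given counts."""
    return tp / pos, tp / (tp + fp)


def interpolate_pr(tp_a, fp_a, tp_b, fp_b, pos, steps):
    """Interpolate between two operating points A and B (requires tp_b > tp_a).

    TP and FP counts are interpolated linearly, which matches a straight
    line in ROC space; precision is recomputed at each step, so the
    resulting path in PR space is generally not a straight line.
    """
    slope = (fp_b - fp_a) / (tp_b - tp_a)  # extra false positives per extra true positive
    points = []
    for i in range(steps + 1):
        x = (tp_b - tp_a) * i / steps
        points.append(pr_point(tp_a + x, fp_a + slope * x, pos))
    return points


# Hypothetical skewed dataset: 20 positives, 2000 negatives.
POS, NEG = 20, 2000
A = (5, 5)    # (TP, FP) at operating point A
B = (10, 30)  # (TP, FP) at operating point B

curve = interpolate_pr(A[0], A[1], B[0], B[1], POS, steps=2)
recall_mid, precision_mid = curve[1]

# Naive straight-line midpoint drawn directly in PR space, for comparison.
(r_a, p_a), (r_b, p_b) = curve[0], curve[-1]
naive_precision_mid = (p_a + p_b) / 2

print("ROC points:", roc_point(*A, POS, NEG), roc_point(*B, POS, NEG))
print("PR midpoint from interpolated counts:", recall_mid, precision_mid)
print("PR midpoint from naive linear interpolation:", recall_mid, naive_precision_mid)
```

With these counts the naive straight line in PR space overstates precision at the midpoint, while the two ROC points differ only slightly in false positive rate because negatives are so numerous. This is one way to see why PR curves can be more informative on skewed data.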

Published in

ICML '06: Proceedings of the 23rd International Conference on Machine Learning
June 2006
1154 pages
ISBN: 1595933832
DOI: 10.1145/1143844

Copyright © 2006 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICML '06 paper acceptance rate: 140 of 548 submissions, 26%
Overall acceptance rate: 140 of 548 submissions, 26%
