The relationship between Precision-Recall and ROC curves
Source: ACM International Conference Proceeding Series, Vol. 148
Proceedings of the 23rd International Conference on Machine Learning
Pittsburgh, Pennsylvania
Pages: 233-240
Year of Publication: 2006
ISBN: 1-59593-383-2
Authors
Jesse Davis, University of Wisconsin-Madison, Madison, WI
Mark Goadrich, University of Wisconsin-Madison, Madison, WI
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1143844.1143874

ABSTRACT

Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we also note differences in the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.
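
The point about interpolation can be illustrated with a small sketch. It assumes the usual definitions (recall = TP / total positives, precision = TP / (TP + FP), false positive rate = FP / total negatives) and interpolates between two thresholds by stepping through confusion-matrix counts rather than drawing a straight line in PR space. The function names and the dataset counts are hypothetical, chosen only for illustration, and the code is not taken from the paper.

```python
def roc_point(tp, fp, total_pos, total_neg):
    """Return (false positive rate, true positive rate) for one threshold."""
    return fp / total_neg, tp / total_pos


def pr_point(tp, fp, total_pos):
    """Return (recall, precision) for one threshold."""
    return tp / total_pos, tp / (tp + fp)


def interpolate_pr(tp_a, fp_a, tp_b, fp_b, total_pos):
    """Interpolate in PR space between thresholds A and B using counts.

    Steps along the straight ROC segment from A to B one true positive at a
    time, adding false positives at the segment's constant rate, and maps
    each intermediate count pair into PR space. Requires tp_b > tp_a.
    """
    fp_per_tp = (fp_b - fp_a) / (tp_b - tp_a)
    points = []
    for x in range(tp_b - tp_a + 1):
        points.append(pr_point(tp_a + x, fp_a + fp_per_tp * x, total_pos))
    return points


def linear_precision(recall, point_a, point_b):
    """Naive straight-line precision between two PR points (comparison only)."""
    (r_a, p_a), (r_b, p_b) = point_a, point_b
    return p_a + (p_b - p_a) * (recall - r_a) / (r_b - r_a)


# Hypothetical skewed dataset: 20 positives, 2000 negatives.
TOTAL_POS, TOTAL_NEG = 20, 2000

# Threshold A: TP=5, FP=5. Threshold B: TP=10, FP=30.
curve = interpolate_pr(5, 5, 10, 30, TOTAL_POS)
for recall, precision in curve:
    naive = linear_precision(recall, curve[0], curve[-1])
    print(f"recall={recall:.2f}  precision={precision:.3f}  linear={naive:.3f}")

# In ROC space, B still looks strong because negatives dominate the data.
print("ROC point for B (FPR, TPR):", roc_point(10, 30, TOTAL_POS, TOTAL_NEG))
```

Under these assumed counts, the count-based precision at each intermediate recall falls below the straight-line value, so linear interpolation in PR space would overstate performance. The ROC point for B has a false positive rate of only 0.015 even though its precision is 0.25, which is the kind of gap that appears on highly skewed data.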



