DOI: 10.1145/1390156.1390246
research-article

Cost-sensitive multi-class classification from probability estimates

Published: 05 July 2008

ABSTRACT

For two-class classification, it is common to classify by setting a threshold on class probability estimates, where the threshold is determined by ROC curve analysis. An analog for multi-class classification is learning a new class partitioning of the multiclass probability simplex to minimize empirical misclassification costs. We analyze the interplay between systematic errors in the class probability estimates and cost matrices for multiclass classification. We explore the effect on the class partitioning of five different transformations of the cost matrix. Experiments on benchmark datasets with naive Bayes and quadratic discriminant analysis show the effectiveness of learning a new partition matrix compared to previously proposed methods.
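
As a minimal sketch of the setting the abstract describes, the snippet below applies the standard minimum-expected-cost decision rule to a vector of class probability estimates, and computes the empirical misclassification cost that a learned partition would aim to reduce. The function names, the convention that cost[i][j] is the cost of predicting class j when the true class is i, and the example matrices are assumptions made for illustration. The abstract does not specify how the paper learns its partition matrix, so that step is not shown here.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def expected_costs(probs: Sequence[float], cost: Matrix) -> List[float]:
    """Expected cost of predicting each class j under the estimates `probs`.

    Assumed convention: cost[i][j] is the cost of predicting class j
    when the true class is i.
    """
    n = len(cost)
    return [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]


def min_cost_class(probs: Sequence[float], matrix: Matrix) -> int:
    """Index of the class with the lowest expected cost under `matrix`."""
    costs = expected_costs(probs, matrix)
    return min(range(len(costs)), key=costs.__getitem__)


def empirical_cost(prob_rows: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   decision_matrix: Matrix,
                   cost: Matrix) -> float:
    """Average true cost when decisions are made with `decision_matrix`.

    `decision_matrix` defines the partition of the probability simplex;
    `cost` is the matrix the decisions are actually charged against.
    """
    total = 0.0
    for probs, y in zip(prob_rows, labels):
        total += cost[y][min_cost_class(probs, decision_matrix)]
    return total / len(labels)


# Hypothetical three-class example.
zero_one = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]    # plain error rate
skewed = [[0, 1, 1], [1, 0, 1], [10, 10, 0]]    # missing class 2 is expensive
probs = [0.5, 0.3, 0.2]

pred_plain = min_cost_class(probs, zero_one)   # most probable class
pred_skewed = min_cost_class(probs, skewed)    # shifted by the cost matrix
```

With the 0/1 matrix the rule reduces to picking the most probable class, while the skewed matrix moves the decision toward the class that is expensive to miss. If the probability estimates are systematically biased, deciding with the true cost matrix need not minimize empirical_cost, which is the motivation for substituting a different decision matrix.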


Published in

ICML '08: Proceedings of the 25th International Conference on Machine Learning
July 2008
1310 pages
ISBN: 9781605582054
DOI: 10.1145/1390156

Copyright © 2008 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 140 of 548 submissions, 26%
