DOI: 10.1145/2811681.2811699

Performance Evaluation of Ensemble Methods For Software Fault Prediction: An Experiment

Published: 28 September 2015

ABSTRACT

In object-oriented software development, a plethora of studies have presented applications of machine learning algorithms for fault prediction. It has also been empirically validated that an ensemble method can improve classification performance compared to a single classifier. However, owing to the inherent differences among machine learning and data mining approaches, the classification performance of ensemble methods varies. In this study, we investigated and evaluated the performance of different ensemble methods, against one another and against their base-level classifiers, in predicting fault-prone classes. We used three ensemble methods, AdaBoostM1, Vote and StackingC, with five base-level classifiers, namely NaiveBayes, Logistic, J48, VotedPerceptron and SMO, in the Weka tool. To evaluate the performance of the ensemble methods, we retrieved twelve datasets of open-source projects from the PROMISE repository. In this experiment, we used k-fold (k=10) cross-validation and ROC analysis for validation, and recall, precision, accuracy and F-measure to evaluate the performance of the ensemble methods and base-level classifiers. Finally, we observed a significant performance improvement from applying ensemble methods compared with their base-level classifiers, and among the ensemble methods, StackingC outperformed the other selected methods for software fault prediction.
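The experimental protocol described above, a stacking ensemble built over several base-level classifiers and scored with 10-fold cross-validation, can be sketched as follows. This is a minimal illustration in Python with scikit-learn, not the paper's actual Weka setup: the synthetic dataset, the particular scikit-learn estimators standing in for NaiveBayes, J48 and SMO, and all parameter values are assumptions for the sake of the example.

```python
# Minimal sketch (not the paper's Weka configuration): a stacking
# ensemble evaluated with 10-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a PROMISE defect dataset: rows are modules,
# columns are code metrics, label 1 marks a fault-prone class.
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.7, 0.3], random_state=42)

# Base learners roughly analogous to NaiveBayes, J48 and SMO in Weka;
# a logistic-regression meta-learner combines their predictions,
# which is the idea behind StackingC.
stack = StackingClassifier(
    estimators=[("nb", GaussianNB()),
                ("tree", DecisionTreeClassifier(random_state=42)),
                ("svm", LinearSVC(random_state=42))],
    final_estimator=LogisticRegression(),
    cv=10)

# k-fold (k=10) cross-validated F-measure, as in the paper's protocol.
f1_scores = cross_val_score(stack, X, y, cv=10, scoring="f1")
print(f"mean F1 over 10 folds: {f1_scores.mean():.3f}")
```

Swapping `scoring="f1"` for `"precision"`, `"recall"`, `"accuracy"` or `"roc_auc"` reproduces the other measures used in the evaluation.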


    • Published in

ASWEC '15 Vol. II: Proceedings of the ASWEC 2015 24th Australasian Software Engineering Conference
      September 2015
      171 pages
      ISBN:9781450337960
      DOI:10.1145/2811681

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Qualifiers

      • short-paper
      • Research
      • Refereed limited

      Acceptance Rates

ASWEC '15 Vol. II paper acceptance rate: 12 of 27 submissions (44%). Overall acceptance rate: 12 of 27 submissions (44%).
