DOI: 10.1145/3194733.3194735
research-article

Using controlled numbers of real faults and mutants to empirically evaluate coverage-based test case prioritization

Published: 28 May 2018

ABSTRACT

Used to establish confidence in the correctness of evolving software, regression testing is an important, yet costly, task. Test case prioritization enables the rapid detection of faults during regression testing by reordering the test suite so that effective tests are run as early as is possible. However, a distinct lack of information about the regression faults found in complex real-world software forced prior experimental studies of these methods to use artificial faults called mutants. Using the Defects4J database of real faults, this paper presents the results of experiments evaluating the effectiveness of four representative test prioritization techniques. Since this paper's results show that prioritization is susceptible to high amounts of variance when only one fault is present, our experiments also control the number of real faults and mutants in the program subject to regression testing. Our overall findings are that, in comparison to mutants, real faults are harder for reordered test suites to quickly detect, suggesting that mutants are not a surrogate for real faults.
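As a rough illustration of what "reordering the test suite so that effective tests are run as early as is possible" can mean for a coverage-based technique, the following is a minimal sketch of greedy "additional coverage" prioritization. It is an assumption made for illustration only: the abstract does not name the four techniques that were evaluated, and the function name, the test names, and the coverage data below are all hypothetical.

```python
from typing import Dict, List, Set


def prioritize_additional_greedy(coverage: Dict[str, Set[str]]) -> List[str]:
    """Order tests so that each next test adds the most not-yet-covered units.

    `coverage` maps a (hypothetical) test name to the set of code units,
    such as statement identifiers, that the test executes.
    """
    remaining = dict(coverage)
    covered: Set[str] = set()
    order: List[str] = []
    while remaining:
        # Pick the test that covers the most units not covered so far.
        # Iterating over sorted names makes ties resolve alphabetically,
        # because max() returns the first maximal element it encounters.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical coverage data for four tests.
example_coverage = {
    "test_a": {"s1", "s2"},
    "test_b": {"s1", "s2", "s3", "s4"},
    "test_c": {"s5"},
    "test_d": {"s3"},
}

print(prioritize_additional_greedy(example_coverage))
```

In this sketch, tests that add no new coverage fall to the end of the ordering in alphabetical order; real tools may instead reset the covered set or break ties differently.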


Published in

    AST '18: Proceedings of the 13th International Workshop on Automation of Software Test
    May 2018
    85 pages
    ISBN: 9781450357432
    DOI: 10.1145/3194733

    Copyright © 2018 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher: Association for Computing Machinery, New York, NY, United States
