ABSTRACT
Used to establish confidence in the correctness of evolving software, regression testing is an important, yet costly, task. Test case prioritization enables the rapid detection of faults during regression testing by reordering the test suite so that effective tests are run as early as possible. However, a distinct lack of information about the regression faults found in complex real-world software forced prior experimental studies of these methods to use artificial faults called mutants. Using the Defects4J database of real faults, this paper presents the results of experiments evaluating the effectiveness of four representative test prioritization techniques. Since this paper's results show that prioritization is susceptible to high amounts of variance when only one fault is present, our experiments also control the number of real faults and mutants in the program subject to regression testing. Our overall findings are that, in comparison to mutants, real faults are harder for reordered test suites to quickly detect, suggesting that mutants are not a surrogate for real faults.