Using controlled numbers of real faults and mutants to empirically evaluate coverage-based test case prioritization

Authors:
David Paterson (University of Sheffield)
Gregory M. Kapfhammer (Allegheny College)
Gordon Fraser (University of Passau)
Phil McMinn (University of Sheffield)

AST '18: Proceedings of the 13th International Workshop on Automation of Software Test, May 2018, Pages 57–63, https://doi.org/10.1145/3194733.3194735

Published: 28 May 2018

ABSTRACT

Used to establish confidence in the correctness of evolving software, regression testing is an important, yet costly, task. Test case prioritization enables the rapid detection of faults during regression testing by reordering the test suite so that effective tests are run as early as is possible. However, a distinct lack of information about the regression faults found in complex real-world software forced prior experimental studies of these methods to use artificial faults called mutants. Using the Defects4J database of real faults, this paper presents the results of experiments evaluating the effectiveness of four representative test prioritization techniques. Since this paper's results show that prioritization is susceptible to high amounts of variance when only one fault is present, our experiments also control the number of real faults and mutants in the program subject to regression testing. Our overall findings are that, in comparison to mutants, real faults are harder for reordered test suites to quickly detect, suggesting that mutants are not a surrogate for real faults.

Published in
AST '18: Proceedings of the 13th International Workshop on Automation of Software Test
May 2018
85 pages
ISBN: 9781450357432
DOI: 10.1145/3194733
Program Chairs: Xiaoying Bai (Tsinghua University, China), J. Jenny Li (Kean University, NJ), Andreas Ulrich (Siemens AG, Germany)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States