Abstract
Localizing failure-inducing code is essential for software debugging. Manual fault localization can be tedious, error-prone, and time-consuming. Therefore, a large body of research effort has been dedicated to automated fault localization. Spectrum-based fault localization, the most intensively studied fault localization approach based on test execution information, may have limited effectiveness, since a code element executed by a failed test does not necessarily have an impact on the test outcome or cause the test failure. To bridge the gap, mutation-based fault localization has been proposed; it transforms the program under test to check the impact of each code element on test outcomes. However, there are limited studies on the effectiveness of mutation-based fault localization on a sufficient number of real bugs. In this paper, we perform an extensive study to compare mutation-based fault localization techniques with various state-of-the-art spectrum-based fault localization techniques on 357 real bugs from the Defects4J benchmark suite. The study results first demonstrate the effectiveness of mutation-based fault localization and also reveal a number of guidelines for further improving it. Based on the learnt guidelines, we further transform test outputs/messages and test code to obtain various mutation information. Then, we propose TraPT, an automated Learning-to-Rank technique to fully explore the obtained mutation information for effective fault localization. The experimental results show that TraPT localizes 65.12% and 94.52% more bugs within Top-1 than state-of-the-art mutation-based and spectrum-based techniques, respectively, when using the default setting of LIBSVM.
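The contrast between the two families can be illustrated with a small sketch. The code below is not TraPT and is not taken from the paper; it assumes the widely used Ochiai formula for the spectrum-based score and a Metallaxis-style aggregation (maximum over a code element's mutants) for the mutation-based score. The test names, element names, and mutant data are invented for illustration.

```python
import math


def ochiai(failed_hit, passed_hit, total_failed):
    """Ochiai score: failed_hit / sqrt(total_failed * (failed_hit + passed_hit))."""
    denom = math.sqrt(total_failed * (failed_hit + passed_hit))
    return failed_hit / denom if denom else 0.0


def sbfl_scores(coverage, passed):
    """Spectrum-based: score each element from which tests executed it.

    coverage: test name -> set of executed elements
    passed:   test name -> True if the test passed
    """
    total_failed = sum(1 for ok in passed.values() if not ok)
    elements = set().union(*coverage.values())
    scores = {}
    for e in elements:
        ef = sum(1 for t, cov in coverage.items() if e in cov and not passed[t])
        ep = sum(1 for t, cov in coverage.items() if e in cov and passed[t])
        scores[e] = ochiai(ef, ep, total_failed)
    return scores


def mbfl_scores(impacted_tests, location, passed):
    """Mutation-based (assumed Metallaxis-style): score each mutant from the
    tests whose behaviour it changes, then take the max per element.

    impacted_tests: mutant -> set of tests whose behaviour changed
    location:       mutant -> element the mutant was applied to
    """
    total_failed = sum(1 for ok in passed.values() if not ok)
    scores = {}
    for m, tests in impacted_tests.items():
        ef = sum(1 for t in tests if not passed[t])
        ep = sum(1 for t in tests if passed[t])
        e = location[m]
        scores[e] = max(scores.get(e, 0.0), ochiai(ef, ep, total_failed))
    return scores


# Hypothetical data: s2 is the faulty element; t3 is the only failing test.
passed = {"t1": True, "t2": True, "t3": False}
coverage = {"t1": {"s1", "s2"}, "t2": {"s1"}, "t3": {"s1", "s2", "s3"}}
impacted_tests = {"m1": {"t1", "t2"}, "m2": {"t3"}, "m3": set()}
location = {"m1": "s1", "m2": "s2", "m3": "s3"}

sbfl = sbfl_scores(coverage, passed)
mbfl = mbfl_scores(impacted_tests, location, passed)
print(sorted(sbfl, key=sbfl.get, reverse=True))
print(sorted(mbfl, key=mbfl.get, reverse=True))
```

In this toy setup, coverage alone favours `s3`, which only the failing test executes, even though mutating `s3` changes no test behaviour. The mutation-based score instead favours `s2`, the only element whose mutant affects the failing test.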
Index Terms
- Transforming programs and tests in tandem for fault localization