ABSTRACT
Learning-based fault localization has been intensively studied in recent years. Prior studies have shown that traditional Learning-to-Rank techniques can precisely diagnose fault locations using various dimensions of fault-diagnosis features, such as suspiciousness values computed by off-the-shelf fault localization techniques. However, as advanced fault localization techniques consider ever more feature dimensions, it becomes challenging for traditional Learning-to-Rank algorithms to automatically identify effective existing/latent features. In this work, we propose DeepFL, a deep learning approach that automatically learns the most effective existing/latent features for precise fault localization. Although the approach is general, in this work we collect suspiciousness-value-based, fault-proneness-based, and textual-similarity-based features from the fault localization, defect prediction, and information retrieval areas, respectively. We evaluate DeepFL on 395 real bugs from the widely used Defects4J benchmark. The experimental results show that DeepFL significantly outperforms the state-of-the-art TraPT/FLUCCS (e.g., localizing 50+ more faults within Top-1). We also investigate the impact of deep model configurations (e.g., loss functions and epoch settings) and of features. Furthermore, DeepFL is surprisingly effective for cross-project prediction.
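To make "suspiciousness values computed by off-the-shelf fault localization techniques" concrete, the sketch below computes two widely used spectrum-based formulas, Ochiai and Tarantula, for a few program elements and stacks them into per-element feature vectors. It is an illustration only and not DeepFL's implementation: the method names, coverage counts, and the choice to rank by the Ochiai score alone are assumptions made for the example. A learned ranker would take such vectors as input rather than sorting on a single formula.

```python
import math

def ochiai(failed_cov, passed_cov, total_failed):
    """Ochiai score: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom > 0 else 0.0

def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """Tarantula score: failing ratio over the sum of failing and passing ratios."""
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom > 0 else 0.0

# Hypothetical coverage data:
# method -> (number of failing tests covering it, number of passing tests covering it)
coverage = {
    "Foo.parse": (2, 1),
    "Foo.format": (1, 4),
    "Bar.init": (0, 5),
}
TOTAL_FAILED, TOTAL_PASSED = 2, 5

# One feature vector per method: [Ochiai, Tarantula].
features = {
    method: [
        ochiai(ef, ep, TOTAL_FAILED),
        tarantula(ef, ep, TOTAL_FAILED, TOTAL_PASSED),
    ]
    for method, (ef, ep) in coverage.items()
}

# Baseline ranking by a single formula (Ochiai), most suspicious first.
ranking = sorted(features, key=lambda m: features[m][0], reverse=True)

for method in ranking:
    print(method, [round(v, 3) for v in features[method]])
```

In this toy setting, a fault counts as localized within Top-1 when the faulty method is `ranking[0]`.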
DeepFL: integrating multiple fault diagnosis dimensions for deep fault localization. Xia Li, Wei Li, Yuqun Zhang, and Lingming Zhang. ISSTA ’19, July 15–19, 2019, Beijing, China.