ABSTRACT
Bug localization, i.e., locating bugs in code snippets, is a frequent task in software development. Static bug-finding tools can reduce the manual effort involved, but they typically detect only bugs that match known, project-independent bug patterns, whereas many bugs in real-world code snippets are project-specific. To address this issue, we propose a novel approach for LOcating Bugs By Searching the most similar sample snippet (LOBBYS). LOBBYS detects bugs with the help of crowd-contributed correct code that implements the functionality the buggy code is expected to implement. Given a buggy code snippet, LOBBYS locates the bug in two steps: (1) normalize the buggy snippet and search the code base for the most similar sample snippet; (2) align the buggy and sample snippets, find the differences between them, and generate a bug report based on those differences. To evaluate LOBBYS, we build an algorithm-oriented code base and select buggy snippets from two real-world systems. The results show that LOBBYS locates bugs in buggy snippets with high precision: at similarity levels of 50%, 70%, and 90%, LOBBYS achieves bug-localization precision of 67%, 83%, and 92%, respectively.
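The two steps described in the abstract can be illustrated with a minimal sketch. It is not the LOBBYS implementation: the abstract does not specify the normalization or the similarity measure, so this example assumes identifier renaming as the normalization and uses line-level matching from Python's difflib as a stand-in for both similarity and alignment. The names normalize, similarity, locate_bug, BUGGY, and CODE_BASE are hypothetical.

```python
import difflib
import keyword
import re

# Identifiers/keywords, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(snippet):
    """Return one normalized string per source line.

    Assumed normalization (not from the paper): every non-keyword name is
    replaced by a placeholder numbered in order of first appearance, so
    that snippets differing only in naming normalize to the same text.
    """
    names = {}
    lines = []
    for raw in snippet.splitlines():
        tokens = []
        for tok in TOKEN_RE.findall(raw):
            if re.match(r"[A-Za-z_]", tok) and not keyword.iskeyword(tok):
                tok = names.setdefault(tok, f"id{len(names)}")
            tokens.append(tok)
        lines.append(" ".join(tokens))
    return lines


def similarity(a, b):
    """Line-level similarity in [0, 1]; a stand-in for the paper's measure."""
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def locate_bug(buggy, code_base, threshold=0.5):
    """Step 1: pick the most similar sample. Step 2: align and report diffs."""
    buggy_norm = normalize(buggy)
    scored = [(similarity(buggy_norm, normalize(s)), s) for s in code_base]
    score, sample = max(scored, key=lambda pair: pair[0])
    if score < threshold:
        return []  # no sample is similar enough to trust the comparison

    buggy_lines, sample_lines = buggy.splitlines(), sample.splitlines()
    matcher = difflib.SequenceMatcher(
        None, buggy_norm, normalize(sample), autojunk=False
    )
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            report.append({
                "buggy_lines": (i1 + 1, i2),  # 1-based, inclusive
                "found": buggy_lines[i1:i2],
                "expected": sample_lines[j1:j2],
            })
    return report


# Hypothetical data: a summation that skips the first element, plus a tiny
# code base of correct samples.
BUGGY = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for i in range(1, len(xs)):\n"
    "        s += xs[i]\n"
    "    return s"
)
CODE_BASE = [
    "def biggest(values):\n"
    "    best = values[0]\n"
    "    for v in values:\n"
    "        if v > best:\n"
    "            best = v\n"
    "    return best",
    "def sum_list(values):\n"
    "    acc = 0\n"
    "    for k in range(len(values)):\n"
    "        acc += values[k]\n"
    "    return acc",
]

for entry in locate_bug(BUGGY, CODE_BASE):
    print(entry)
```

In this sketch the threshold parameter plays the role of the similarity level mentioned in the abstract: raising it makes the tool report only when a very close sample exists, which is consistent with the reported trend of higher precision at higher similarity.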
Index Terms
Bug localization via searching crowd-contributed code