DOI: 10.1145/2677832.2677833

Bug localization via searching crowd-contributed code

Published: 17 November 2014

ABSTRACT

Bug localization, i.e., locating bugs in code snippets, is a frequent task in software development. Although static bug-finding tools can reduce the manual effort of bug localization, these tools typically detect only bugs that match known, project-independent bug patterns, whereas many bugs in real-world code snippets are project-specific. To address this issue, we propose a novel approach for LOcating Bugs By Searching the most similar sample snippet (LOBBYS). LOBBYS detects bugs with the help of crowd-contributed correct code that implements the function the buggy code is expected to implement. Given a buggy code snippet, LOBBYS takes two steps to locate the bug: (1) normalize the buggy snippet, and then search the code base for the most similar sample snippet; (2) align the buggy and sample code snippets, find the differences between them, and generate a bug report based on those differences. To evaluate LOBBYS, we build an algorithm-oriented code base and select buggy snippets from two real-world systems. The results show that LOBBYS can locate bugs in buggy snippets with high precision: at similarity levels of 50%, 70%, and 90%, LOBBYS achieves bug-localization precision of 67%, 83%, and 92%, respectively.
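The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the LOBBYS implementation: the abstract does not say how snippets are normalized, how similarity is measured, or how alignment is done, so the token-level normalization, the use of `difflib.SequenceMatcher` for both similarity and alignment, and the names `normalize`, `similarity`, `locate_bug`, and `threshold` are all assumptions made for the example.

```python
import difflib
import re

# Assumption: a tiny keyword set for C-like snippets; a real tool would use a parser.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")


def normalize(snippet):
    """Step 1a: map each non-blank line to (line number, normalized token string).

    Every identifier that is not a keyword is replaced by the placeholder ID,
    so renamed variables do not count as differences.
    """
    normalized = []
    for lineno, raw in enumerate(snippet.splitlines(), start=1):
        tokens = TOKEN.findall(raw)
        if not tokens:
            continue
        text = " ".join(
            "ID" if (t[0].isalpha() or t[0] == "_") and t not in KEYWORDS else t
            for t in tokens
        )
        normalized.append((lineno, text))
    return normalized


def similarity(norm_a, norm_b):
    """Line-level similarity in [0, 1] between two normalized snippets."""
    lines_a = [text for _, text in norm_a]
    lines_b = [text for _, text in norm_b]
    return difflib.SequenceMatcher(None, lines_a, lines_b).ratio()


def locate_bug(buggy, code_base, threshold=0.5):
    """Step 1b and step 2: pick the most similar sample, align, report differences.

    Returns (similarity, report), where each report entry is
    (edit kind, buggy line numbers, normalized lines of the sample).
    """
    norm_buggy = normalize(buggy)
    best = max(code_base, key=lambda s: similarity(norm_buggy, normalize(s)))
    norm_best = normalize(best)
    score = similarity(norm_buggy, norm_best)
    if score < threshold:
        return score, []  # no sample is similar enough to trust the diff

    matcher = difflib.SequenceMatcher(
        None, [t for _, t in norm_buggy], [t for _, t in norm_best]
    )
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        buggy_lines = [norm_buggy[i][0] for i in range(i1, i2)]
        expected = [norm_best[j][1] for j in range(j1, j2)]
        report.append((tag, buggy_lines, expected))
    return score, report
```

For a buggy summation loop that uses `i < n` where a correct sample with different variable names uses `k <= m`, this sketch selects that sample over unrelated ones and reports a single `replace` on the loop-header line. The `threshold` parameter loosely mirrors the similarity levels mentioned in the abstract, but how LOBBYS actually applies them is not stated there.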


          Published in

          Internetware '14: Proceedings of the 6th Asia-Pacific Symposium on Internetware
          November 2014
          152 pages
          ISBN: 9781450333030
          DOI: 10.1145/2677832
          General Chairs: Hong Mei, Jian Lv
          Program Chairs: Minghui Zhou, Charles Zhang

          Copyright © 2014 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 55 of 111 submissions, 50%
