DOI: 10.1145/1134285.1134336

Who should fix this bug?

Published: 28 May 2006

ABSTRACT

Open source development projects typically support an open bug repository to which both developers and users can report bugs. The reports that appear in this repository must be triaged to determine if the report is one which requires attention and if it is, which developer will be assigned the responsibility of resolving the report. Large open source developments are burdened by the rate at which new bug reports appear in the bug repository. In this paper, we present a semi-automated approach intended to ease one part of this process, the assignment of reports to a developer. Our approach applies a machine learning algorithm to the open bug repository to learn the kinds of reports each developer resolves. When a new report arrives, the classifier produced by the machine learning technique suggests a small number of developers suitable to resolve the report. With this approach, we have reached precision levels of 57% and 64% on the Eclipse and Firefox development projects respectively. We have also applied our approach to the gcc open source development with less positive results. We describe the conditions under which the approach is applicable and also report on the lessons we learned about applying machine learning to repositories used in open source development.
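The recommendation step described in the abstract can be illustrated with a small sketch. The abstract does not name the learning algorithm, so the example below assumes a multinomial Naive Bayes classifier over report words with Laplace smoothing; the class `TriageRecommender`, the helper `precision`, the developer names, and the toy reports are all hypothetical and are not taken from the paper. Precision is assumed here to mean the fraction of suggested developers who would be appropriate resolvers.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class TriageRecommender:
    """Hypothetical multinomial Naive Bayes model: one class per developer."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # developer -> word -> count
        self.report_counts = Counter()           # developer -> resolved reports
        self.vocabulary = set()

    def train(self, resolved_reports):
        """Learn from (report_text, developer_who_resolved_it) pairs."""
        for text, developer in resolved_reports:
            tokens = tokenize(text)
            self.word_counts[developer].update(tokens)
            self.report_counts[developer] += 1
            self.vocabulary.update(tokens)

    def recommend(self, text, k=3):
        """Return the k developers with the highest log-posterior score."""
        tokens = tokenize(text)
        total_reports = sum(self.report_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for developer, n_reports in self.report_counts.items():
            counts = self.word_counts[developer]
            total_words = sum(counts.values())
            # Log prior: share of past reports this developer resolved.
            score = math.log(n_reports / total_reports)
            for token in tokens:
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[developer] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


def precision(recommended, acceptable_resolvers):
    """Fraction of recommended developers who are acceptable resolvers."""
    if not recommended:
        return 0.0
    hits = sum(1 for dev in recommended if dev in acceptable_resolvers)
    return hits / len(recommended)


# Toy history of resolved reports (invented for illustration only).
history = [
    ("NullPointerException in Java editor when renaming a class", "alice"),
    ("Editor crashes on refactoring rename of a method", "alice"),
    ("Toolbar icons are blurry on high DPI display", "bob"),
    ("Button layout broken in preferences dialog", "bob"),
    ("Build fails with linker error on Linux", "carol"),
]

recommender = TriageRecommender()
recommender.train(history)
suggestions = recommender.recommend("Crash in editor during rename refactoring", k=2)
print(suggestions)
print(precision(suggestions, {"alice"}))
```

The sketch returns a short ranked list rather than a single name, which mirrors the semi-automated setting in the abstract: a human triager still makes the final assignment from the suggestions.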

    Published in

      ICSE '06: Proceedings of the 28th international conference on Software engineering
      May 2006
      1110 pages
      ISBN:1595933751
      DOI:10.1145/1134285

      Copyright © 2006 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 276 of 1,856 submissions, 15%
