Predicting defect densities in source code files with decision tree learners
Source International Conference on Software Engineering archive
Proceedings of the 2006 international workshop on Mining software repositories
Shanghai, China
SESSION: Defects
Pages: 119 - 125
Year of Publication: 2006
ISBN:1-59593-397-2
Authors
Patrick Knab  University of Zurich, Switzerland
Martin Pinzger  University of Zurich, Switzerland
Abraham Bernstein  University of Zurich, Switzerland
Sponsors
ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1137983.1138012

ABSTRACT

With the advent of open source software repositories, the data available for defect prediction in source files has increased tremendously. Although traditional statistics have produced reasonable results, the sheer amount of data and the problem context of defect prediction demand more sophisticated analysis, such as that provided by current data mining and machine learning techniques. In this work we focus on defect density prediction and present an approach that applies a decision tree learner to evolution data extracted from the Mozilla open source web browser project. The evolution data includes various source code, modification, and defect measures computed from seven recent Mozilla releases. Among the modification measures we also take into account change coupling, a measure of the number of change dependencies between source files. The main reason for choosing decision tree learners over, for example, neural nets was the goal of finding underlying rules that can be easily interpreted by humans. To find these rules, we set up a number of experiments to test common hypotheses regarding defects in software entities. Our experiments showed that a simple tree learner can produce good results with various sets of input data.
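
The change coupling measure mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definition, so the code below assumes a simple one: two files are coupled once for every commit that modifies both, and a file's coupling is the sum over all its pairs. The function name `change_coupling`, the commit list, and the file names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def change_coupling(transactions):
    """Count co-changes between files.

    Assumption (not taken from the paper): two files are change-coupled
    once for every commit that modifies both of them.
    Returns (pair_counts, per_file_totals).
    """
    pair_counts = Counter()
    for files in transactions:
        # set() drops duplicates; sorted() gives each unordered pair one key
        for a, b in combinations(sorted(set(files)), 2):
            pair_counts[(a, b)] += 1

    # Sum the pair counts per file to get one coupling value per file
    per_file = Counter()
    for (a, b), n in pair_counts.items():
        per_file[a] += n
        per_file[b] += n
    return pair_counts, per_file


# Hypothetical commit history; each inner list is one commit's changed files
commits = [
    ["nsFrame.cpp", "nsFrame.h"],
    ["nsFrame.cpp", "nsFrame.h", "nsBox.cpp"],
    ["nsBox.cpp"],
]
pairs, totals = change_coupling(commits)
```

A per-file value such as `totals["nsFrame.cpp"]` could then serve as one input attribute, alongside source code and defect measures, for a tree learner.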



