ABSTRACT
The effectiveness of the software testing process is a key issue in meeting the increasing demand for quality without increasing the overall cost of software development. Estimating software fault-proneness is important for assessing cost and quality, and thus for better planning and tuning of the testing process. Unfortunately, no general techniques are available for estimating software fault-proneness and the distribution of faults, which would help identify the level of testing required for a given quality target. Although software complexity and testing thoroughness are intuitively related to the cost of quality assurance and the quality of the final product, individual software metrics and coverage criteria provide limited help in planning the testing process and assuring the required quality. Using logistic regression, this paper shows how to build models that relate software measures to software fault-proneness for classes of homogeneous software products. It also proposes the use of cross-validation for selecting valid models even for small data sets. Early results show that statistical models based on historical data can estimate the fault-proneness of software modules before testing, and thus support better planning and monitoring of testing activities.
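The approach described above, a logistic regression model fitted to module-level software measures and validated with cross-validation, can be sketched as follows. This is a minimal illustration, not the paper's actual models or data: the two metrics (normalized lines of code and cyclomatic complexity), the synthetic module data, and the gradient-descent training procedure are all assumptions made for the example. Leave-one-out cross-validation is used because it suits the small data sets the paper targets.

```python
import math

def sigmoid(z):
    """Logistic function: maps a linear score to a fault probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic-regression weights (intercept first) by stochastic gradient ascent
    on the log-likelihood. X: list of metric vectors, y: 0/1 fault labels."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - sigmoid(z)          # gradient of the log-likelihood
            w[0] += lr * err
            for j, xj in enumerate(xi):
                w[j + 1] += lr * err * xj
    return w

def predict(w, xi):
    """Estimated probability that the module described by metrics xi is faulty."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return sigmoid(z)

def leave_one_out_accuracy(X, y):
    """Leave-one-out cross-validation: hold out each module in turn,
    train on the rest, and score the held-out prediction at threshold 0.5."""
    correct = 0
    for i in range(len(X)):
        w = train_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += int((predict(w, X[i]) >= 0.5) == bool(y[i]))
    return correct / len(X)

# Hypothetical historical data: each module is [LOC / 1000, cyclomatic / 20],
# with label 1 if a fault was found in it. Purely illustrative values.
X = [[0.10, 0.10], [0.20, 0.15], [0.15, 0.20], [0.30, 0.25],
     [0.80, 0.90], [0.90, 0.85], [0.70, 0.80], [0.85, 0.95]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

acc = leave_one_out_accuracy(X, y)
print(f"leave-one-out accuracy: {acc:.2f}")
```

The cross-validated accuracy, rather than the fit on the training data, is what the paper proposes as the criterion for accepting a model, since with small data sets an apparently good fit can be an artifact of overfitting.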