research-article

Predicting defects using network analysis on dependency graphs

Authors:
Thomas Zimmermann

University of Calgary, Calgary, AB, Canada

University of Calgary, Calgary, AB, Canada
View Profile

,
Nachiappan Nagappan

Microsoft Research, Redmond, WA, USA

Microsoft Research, Redmond, WA, USA
View Profile

ICSE '08: Proceedings of the 30th international conference on Software engineeringMay 2008Pages 531–540https://doi.org/10.1145/1368088.1368161

Published:10 May 2008Publication History

ICSE '08: Proceedings of the 30th international conference on Software engineering

Pages 531–540

ABSTRACT

In software development, resources for quality assurance are limited by time and by cost. In order to allocate resources effectively, managers need to rely on their experience backed by code complexity metrics. But often dependencies exist between various pieces of code over which managers may have little knowledge. These dependencies can be construed as a low level graph of the entire system. In this paper, we propose to use network analysis on these dependency graphs. This allows managers to identify central program units that are more likely to face defects. In our evaluation on Windows Server 2003, we found that the recall for models built from network measures is by 10% points higher than for models built from complexity metrics. In addition, network measures could identify 60% of the binaries that the Windows developers considered as critical-twice as many as identified by complexity metrics.

References

F. B. e. Abreu and W. Melo, "Evaluating the Impact of OO Design on Software Quality," in International Software Metrics Symposium, Berlin, 1996. Google ScholarDigital Library
V. R. Basili, L. C. Briand, and W. L. Melo, "A Validation of Object Orient Design Metrics as Quality Indicators," IEEE Transactions on Software Engineering, vol. 22, pp. 751--761, 1996. Google ScholarDigital Library
V. R. Basili, F. Shull, and F. Lanubile, "Building Knowledge Through Families of Experiments," IEEE Transactions on Software Engineering, vol. 25, pp. 456 -- 473, 1999. Google ScholarDigital Library
J. Bevan, "Software Instability Analysis: Co-Change Analysis Across Configuration-Based Dependence Relationships." PhD Thesis: University of California, Santa Cruz, 2006. Google ScholarDigital Library
J. Bevan and J. E. James Whitehead, "Identification of Software Instabilities," in Working Conference on Reverse Engineering, 2003, pp. 134--145. Google ScholarDigital Library
A. B. Binkley and S. Schach, "Validation of the coupling dependency metric as a predictor of run-time failures and maintenance measures," in International Conference on Software Engineering, 1998, pp. 452 -- 455 Google ScholarDigital Library
D. Binkley and M. Harman, "An empirical study of predicate dependence levels and trends," in International Conference on Software Engineering 2003, pp. 330--339. Google ScholarDigital Library
S. P. Borgatti, M. G. Everett, and L. C. Freeman, "Ucinet for Windows: Software for Social Network Analysis," Analytic Technologies, Harvard, MA, 2002.Google Scholar
L. Briand, P. Devanbu, and W. Melo, "An investigation into coupling measures for C++," in International Conference on Software Engineering Boston, Massachusetts, United States: ACM Press, 1997. Google ScholarDigital Library
L. C. Briand, J. W. Daly, and J. Wüst, "A Unified Framework for Coupling Measurement in Object-Oriented Systems," IEEE Transactions on Software Engineering, vol. 25, pp. 91--121, 1999. Google ScholarDigital Library
L. C. Briand, W. L. Melo, and J. Wüst, "Assessing the Applicability of Fault-Proneness Models Across Object-Oriented Software Projects," IEEE Transactions on Software Engineering, vol. 28, pp. 706--720, 2002. Google ScholarDigital Library
C. Bron and J. Kerbosch, "Algorithm 457: finding all cliques of an undirected graph " Communications of the ACM, vol. 16, pp. 575--577, 1973. Google ScholarDigital Library
R. Burt, Structural Holes: The Social Structure of Competition. Cambridge, MA: Harvard University Press, 1995.Google Scholar
J. Cho, H. Garcia-Molina, and L. Page, "Efficient crawling through URL ordering," in International Conference on World Wide Web Brisbane, Australia: Elsevier Science Publishers B. V., 1998. Google ScholarDigital Library
N. Fenton and N. Ohlsson, "Quantitative analysis of faults and failures in a complex software system," IEEE Transactions on Software Engineering, vol. 26, pp. 797 -- 814 2000. Google ScholarDigital Library
N. E. Fenton and S. L. Pfleeger, Software Metrics: A Rigorous and Practical Approach: Brooks/Cole, 1998. Google ScholarDigital Library
J. Ferrante, K. J. Ottenstein, and J. D. Warren, "The program dependence graph and its use in optimization," ACM Transactions on Programming Languages and Systems, vol. 9, pp. 319 -- 349 1987. Google ScholarDigital Library
R. A. Ghosh, "Clustering and dependencies in free/open source software development: Methodology and tools," First Monday, vol. 8, 2003.Google Scholar
R. A. Hanneman and M. Riddle, Introduction to social network methods. Riverside, CA: University of California, Riverside 2005.Google Scholar
A. E. Hassan and R. C. Holt, "The Small World of Software Reverse Engineering," in Working Conference on Reverse Engineering: IEEE Computer Society, 2004. Google ScholarDigital Library
S. M. Henry and D. Kafura, "Software Structure Metrics based on Information Flow," IEEE Transactions on Software Engineering, vol. 7, pp. 510--518, 1981. Google ScholarDigital Library
S.-K. Huang and K.-m. Liu, "Mining version histories to verify the learning process of Legitimate Peripheral Participants," in International Workshop on Mining Software Repositories, 2005. Google ScholarDigital Library
E. J. Jackson, A Users Guide to Principal Components. Hoboken, NJ: John Wiley & Sons Inc., 2003.Google Scholar
A. J. Ko and B. A. Myers, "A framework and methodology for studying the causes of software errors in programming systems," Journal of Visual Languages & Computing, vol. 16, pp. 41--84, 2005. Google ScholarDigital Library
B. Korel, "The program dependence graph in static program testing," Information Processing Letters, vol. 24, pp. 103--108, 1987. Google ScholarDigital Library
Z. Li, L. Tan, X. Wang, S. Lu, Y. Zhou, and C. Zhai, "Have things changed now? An empirical study of bug characteristics in modern open source software," in Workshop on Architectural and System Support for Improving Software Dependability San Jose, California: ACM Press, 2006. Google ScholarDigital Library
L. Lopez-Fernandez, G. Robles, and J. M. Gonzalez-Barahona, "Applying Social Network Analysis to the Information in CVS Repositories," in International Workshop on Mining Software Repositories, Edinburgh, Scotland, UK, 2004, pp. 101--105.Google Scholar
G. Madey, V. Freeh, and R. Tynan, "The open source software development phenomenon: An analysis based on social network theory," Americas Conference on Information Systems, 2002.Google Scholar
R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, "Network Motifs: Simple Building Blocks of Complex Networks," Science, vol. 298, pp. 824--827, October 25, 2002 2002.Google ScholarCross Ref
K.-H. Möller and D. J. Paulish, "An empirical investigation of software fault distribution," in International Software Metrics Symposium, 1993, pp. 82--90.Google Scholar
J. Munson and T. Khoshgoftaar, "The Detection of Fault-Prone Programs," IEEE Transactions on Software Engineering, vol. 18, pp. 423--433, 1992. Google ScholarDigital Library
N. Nagappan and T. Ball, "Use of Relative Code Churn Measures to Predict System Defect Density," in International Conference on Software Engineering, St. Louis, MO, 2005, pp. 284--292. Google ScholarDigital Library
N. Nagappan and T. Ball, "Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study," in International Symposium on Empirical Software Engineering and Measurement, 2007, pp. 364--373. Google ScholarDigital Library
N. Nagappan, T. Ball, and B. Murphy, "Using Historical In-Process and Product Metrics for Early Estimation of Software Failures.," in International Symposium on Software Reliability Engineering, 2006, pp. 62--74. Google ScholarDigital Library
N. Nagappan, T. Ball, and A. Zeller, "Mining metrics to predict component failures," in International Conference on Software Engineering, 2006, pp. 452--461. Google ScholarDigital Library
N. J. D. Nagelkerke, "A note on a general definition of the coefficient of determination," Biometrika, vol. 78, pp. 691--692, 1991.Google ScholarCross Ref
M. E. J. Newman, "The structure and function of complex networks," SIAM Review, vol. 45, pp. 167--456, 2003.Google ScholarDigital Library
M. Ohira, N. Ohsugi, T. Ohoka, and K.-i. Matsumoto, "Accelerating cross-project knowledge collaboration using collaborative filtering and social networks," in International Workshop on Mining Software Repositories St. Louis, Missouri: ACM Press, 2005. Google ScholarDigital Library
N. Ohlsson and H. Alberg, "Predicting fault-prone software modules in telephone switches," IEEE Transactions on Software Engineering, vol. 22, pp. 886 -- 894 1996. Google ScholarDigital Library
A. Orso, S. Sinha, and M. J. Harrold, "Classifying data dependences in the presence of pointers for program comprehension, testing, and debugging," ACM Transactions on Software Engineering and Methodology, vol. 13, pp. 199 -- 239 2004. Google ScholarDigital Library
T. Ostrand, E. Weyuker, and R. M. Bell, "Predicting the location and number of faults in large software systems," IEEE Transactions on Software Engineering, vol. 31, pp. 340 -- 355 2005. Google ScholarDigital Library
A. Pogdurski and L. A. Clarke, "A Formal Model of Program Dependences and its Implications for Software Testing, Debugging, and Maintenance," IEEE Transactions on Software Engineering, vol. 16, pp. 965--979, 1990. Google ScholarDigital Library
G. Sabidussi, "The centrality index of a graph," Psychometrika, vol. 31, pp. 581--603, 1966.Google ScholarCross Ref
A. Schröter, T. Zimmermann, and A. Zeller, "Predicting Component Failures at Design Time," in International Symposium on Empirical Software Engineering Rio de Janeiro, Brazil, 2006. Google ScholarDigital Library
S. Sinha, M. J. Harrold, and G. Rothermel, "Interprocedural control dependence," ACM Transactions on Software Engineering and Methodology, vol. 10, pp. 209 -- 254 2001. Google ScholarDigital Library
A. Srivastava, T. J., and C. Schertz, "Efficient Integration Testing using Dependency Analysis," Microsoft Research-Technical Report, MSR-TR-2005-94, 2005.Google Scholar
R. Subramanyam and M. S. Krishnan, "Empirical Analysis of CK Metrics for Object-Oriented Design Complexity: Implications for Software Defects.," IEEE Transactions on Software Engineering, vol. 29, pp. 297--310, 2003. Google ScholarDigital Library
G. Tassey, "The Economic Impacts of Inadequate Infrastructure for Software Testing," National Institute of Standards and Technology 2002.Google Scholar
S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press, 1984.Google Scholar
T. Zimmermann and N. Nagappan, "Predicting Subsystem Defects using Dependency Graph Complexities," in International Symposium on Software Reliability Engineering Trollhättan, Sweden, 2007. Google ScholarDigital Library

Index Terms

Recommendations

Predicting vulnerable software components with dependency graphs
MetriSec '10: Proceedings of the 6th International Workshop on Security Measurements and Metrics

Security metrics and vulnerability prediction for software have gained a lot of interests from the community. Many software security metrics have been proposed e.g., complexity metrics, cohesion and coupling metrics. In this paper, we propose a novel ...
Read More
Analyzing and predicting software integration bugs using network analysis on requirements dependency network

Complexity, cohesion and coupling have been recognized as prominent indicators of software quality. One characterization of software complexity is the existence of dependency relationships. Moreover, the degree of dependency reflects the cohesion and ...
Read More
Network Versus Code Metrics to Predict Defects: A Replication Study
ESEM '11: Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement

Several defect prediction models have been proposed to identify which entities in a software system are likely to have defects before its release. This paper presents a replication of one such study conducted by Zimmermann and Nagappan on Windows Server ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ICSE '08: Proceedings of the 30th international conference on Software engineering
May 2008
558 pages
ISBN:9781605580791
DOI:10.1145/1368088
General Chair:
Wilhelm Schäfer
University of Paderborn
,
Program Chairs:
Matthew B. Dwyer
University of Nebraska
,
Volker Gruhn
University of Leipzig
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 10 May 2008
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
defect prediction
dependency graph
network analysis
windows server 2003
Qualifiers
- research-article
Conference

Acceptance Rates
ICSE '08 Paper Acceptance Rate56of370submissions,15%Overall Acceptance Rate276of1,856submissions,15%
More

Upcoming Conference

ICSE 2025

2025 IEEE/ACM 46th International Conference on Software Engineering

April 26 - May 3, 2025

Ottawa , ON , Canada
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 416
  Total Citations
  View Citations
- 2,727
  Total Downloads
- Downloads (Last 12 months)101
- Downloads (Last 6 weeks)14
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Predicting defects using network analysis on dependency graphs

ICSE '08: Proceedings of the 30th international conference on Software engineering

ABSTRACT

References

Cited By

Index Terms

Recommendations

Predicting vulnerable software components with dependency graphs

Analyzing and predicting software integration bugs using network analysis on requirements dependency network

Network Versus Code Metrics to Predict Defects: A Replication Study