research-article

Can developer-module networks predict failures?

Authors:
Martin Pinzger, University of Zurich, Zurich, Switzerland
Nachiappan Nagappan, Microsoft Research, Redmond
Brendan Murphy, Microsoft Research, Cambridge, UK

SIGSOFT '08/FSE-16: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering, November 2008, Pages 2–12. https://doi.org/10.1145/1453101.1453105

Published: 09 November 2008

ABSTRACT

Software teams should follow a well-defined goal and keep their work focused, because work fragmentation is bad for efficiency and quality. In this paper we empirically investigate the relationship between the fragmentation of developer contributions and the number of post-release failures. Our approach is to represent developer contributions with a developer-module network that we call a contribution network. We use network centrality measures to quantify the degree of fragmentation of developer contributions: fragmentation is determined by the centrality of software modules in the contribution network. Our claim is that central software modules are more likely to be failure-prone than modules located in surrounding areas of the network. We test this hypothesis by exploring the network centrality of Microsoft Windows Vista binaries using several network centrality measures as well as linear and logistic regression analysis. In particular, we investigate which centrality measures are significant predictors of the probability and number of post-release failures. The results of our experiments show that central modules are more failure-prone than modules located in surrounding areas of the network. The results further confirm that the number of authors and the number of commits are significant predictors of the probability of post-release failures. For predicting the number of post-release failures, the closeness centrality measure is the most significant.
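
To make the idea of a contribution network concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the network is an unweighted bipartite graph linking developers to the modules they committed to, and it uses the textbook closeness definition (number of reachable nodes divided by the sum of shortest-path distances), which may differ from the variant used in the paper. The function names, developer names, and module names are all hypothetical.

```python
from collections import defaultdict, deque


def build_contribution_network(commits):
    """Build an undirected developer-module graph from (developer, module) pairs.

    Nodes are tagged tuples so a developer and a module can share a name.
    Edge weights (e.g. commit counts) are ignored in this sketch.
    """
    graph = defaultdict(set)
    for developer, module in commits:
        dev_node = ("dev", developer)
        mod_node = ("mod", module)
        graph[dev_node].add(mod_node)
        graph[mod_node].add(dev_node)
    return dict(graph)


def closeness(graph, node):
    """Closeness centrality: reachable nodes divided by total shortest-path distance."""
    distance = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


def module_centrality(graph):
    """Return the number of authors and the closeness for every module node."""
    return {
        name: {"authors": len(graph[(kind, name)]),
               "closeness": closeness(graph, (kind, name))}
        for (kind, name) in graph
        if kind == "mod"
    }


# Hypothetical commit history: (developer, module) pairs.
commits = [
    ("alice", "kernel.dll"), ("bob", "kernel.dll"), ("carol", "kernel.dll"),
    ("alice", "ui.dll"),
    ("carol", "net.dll"), ("dave", "net.dll"),
]
scores = module_centrality(build_contribution_network(commits))
most_central = max(scores, key=lambda m: scores[m]["closeness"])
print(most_central, scores[most_central])
```

In this toy data the module touched by the most developers also has the highest closeness, so under the paper's hypothesis it would be the candidate to inspect first. Per-module scores like these could then serve as predictor variables in a regression model.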

Published in
SIGSOFT '08/FSE-16: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering
November 2008
369 pages
ISBN:9781595939951
DOI:10.1145/1453101
General Chair: Mary Jean Harrold, Georgia Institute of Technology
Program Chair: Gail C. Murphy, University of British Columbia
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
developer contribution network
failure prediction
network centrality measures
social network analysis
Acceptance Rates
Overall Acceptance Rate: 17 of 128 submissions, 13%