DOI: 10.1145/2804360.2804366
research-article

Revisiting the applicability of the Pareto principle to core development teams in open source software projects

Published: 30 August 2015

ABSTRACT

It is often observed that the majority of the development work of an Open Source Software (OSS) project is contributed by a core team, i.e., a small subset of the pool of active developers. In fact, recent work has found that core development teams follow the Pareto principle: roughly 80% of the code contributions are produced by 20% of the active developers. However, those findings are based on samples of between one and nine studied systems. In this paper, we revisit prior studies about core developers using 2,496 projects hosted on GitHub. We find that even when we vary the heuristic for detecting core developers, and when we control for system size, team size, and project age: (1) the Pareto principle does not seem to apply for 40%-87% of GitHub projects; and (2) more than 88% of GitHub projects have fewer than 16 core developers. Moreover, we find that when we control for the quantity of contributions, bug fixing accounts for a similar proportion of the contributions of both core (18%-20%) and non-core developers (21%-22%). Our findings suggest that the Pareto principle is not compatible with the core teams of many GitHub projects. In fact, several of the studied GitHub projects are susceptible to the “bus factor,” where the impact of a core developer leaving would be quite harmful.
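To make the 80/20 claim concrete, the sketch below shows one possible way to detect a core team from commit authorship and then test whether a project follows the Pareto principle. The abstract does not say which heuristics the paper uses, so the commit-count heuristic here is an assumption, and the function names `core_developers` and `follows_pareto` and the default thresholds are hypothetical choices made for illustration only.

```python
from collections import Counter


def core_developers(commit_authors, share=0.8):
    """Return the smallest group of top committers covering `share` of commits.

    Assumed heuristic (not taken from the paper): rank developers by commit
    count and add them to the core until the target share is reached.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    core, covered = [], 0
    # Walk developers from most to least active.
    for dev, n in counts.most_common():
        if covered >= share * total:
            break
        core.append(dev)
        covered += n
    return core


def follows_pareto(commit_authors, share=0.8, team_fraction=0.2):
    """True if the core team is at most `team_fraction` of active developers."""
    n_devs = len(set(commit_authors))
    core = core_developers(commit_authors, share)
    return len(core) <= team_fraction * n_devs


# Hypothetical project: two developers wrote 85 of 100 commits,
# and nine others wrote the remaining 15.
skewed = ["a"] * 60 + ["b"] * 25 + [f"x{i}" for i in range(6)] * 2 + [f"y{i}" for i in range(3)]
# Hypothetical project: five developers with ten commits each.
flat = [f"d{i}" for i in range(5)] * 10

print(core_developers(skewed), follows_pareto(skewed))
print(len(core_developers(flat)), follows_pareto(flat))
```

Under this heuristic, a project with evenly spread contributions fails the check because most of its developers are needed to reach 80% of the commits. A project whose core team is only one or two people passes the check but has a small core, which is the situation the abstract associates with the "bus factor."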


Published in

IWPSE 2015: Proceedings of the 14th International Workshop on Principles of Software Evolution
August 2015, 78 pages
ISBN: 9781450338165
DOI: 10.1145/2804360

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
