The Commit Size Distribution of Open Source Software [HICSS 2009]

Authors: Oliver Arafat, Dirk Riehle

Abstract: With the growing economic importance of open source, we need to improve our understanding of how open source software development processes work. The analysis of code contributions to open source projects is an important part of such research. In this paper we analyze the size of code contributions to more than 9,000 open source projects. We review the total distribution and distinguish three categories of code contributions using a size-based heuristic: single focused commits, aggregate team contributions, and repository refactorings. We find that both the overall distribution and the individual categories follow a power law. We also suggest that distinguishing these commit categories by size will benefit future analyses.

Reference: In Proceedings of the 42nd Hawaiian International Conference on System Sciences (HICSS-42). IEEE Press, 2009. Page 1-8.

Available as a PDF file.

Posted on

Comments

  1. Craig Anslow Avatar

    I had an idea after reading a book and looks like you have already done it, nice work!
    http://softvis.wordpress.com/2011/03/13/bursts-the-hidden-pattern-behind-everything-we-do/
    Cheers Craig

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Share the Joy

Share on LinkedIn

Share by email

Share on Twitter / X

Share on WhatsApp

Featured Startups

QDAcity makes qualitative research and qualitative data analysis fun and easy.
EDITIVE makes inter- and intra-company document collaboration more effective.

Featured Projects

Making free and open data easy, safe, and reliable to use
Bringing business intelligence to engineering management
Making open source in products easy, safe, and fun to use