The Sweet Spot of Code Commenting in Open Source

In a large-scale study of active open source projects, we found an average comment density of about 20% (one comment line in every five lines of code). Given that much of open source remains volunteer work, we believe that a comment density of 20% represents the sweet spot of code commenting in open source projects: you are neither over-documenting your code and hence wasting resources nor under-documenting it and thereby endangering your project.
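
To make the metric concrete, here is a minimal sketch of how a comment density figure could be computed. It assumes that density means comment lines divided by the sum of comment and code lines, that blank lines are ignored, and that only whole-line comments with a single prefix are counted. The function name `comment_density` and the sample snippet are hypothetical and only illustrate the arithmetic; they are not the tooling used in the study.

```python
def comment_density(source: str, comment_prefix: str = "#") -> float:
    """Return comment lines / (comment lines + code lines).

    Assumptions for this sketch: blank lines are ignored, and only
    whole-line comments starting with `comment_prefix` are counted.
    """
    comment_lines = 0
    code_lines = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines count toward neither total
        if stripped.startswith(comment_prefix):
            comment_lines += 1
        else:
            code_lines += 1
    total = comment_lines + code_lines
    return comment_lines / total if total else 0.0


# Hypothetical sample: one comment line and four code lines.
sample = """\
# add two numbers
def add(a, b):
    return a + b

result = add(1, 2)
print(result)
"""

density = comment_density(sample)
print(f"comment density: {density:.0%}")
```

A real measurement would also need to handle block comments, trailing comments, and language-specific syntax, which this sketch deliberately leaves out.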

The Comment Density of Open Source Software Code

Authors: Oliver Arafat, Dirk Riehle

Abstract: The development processes of open source software are different from traditional closed source development processes. Still, open source software is frequently of high quality. Thus, we are investigating how open source software creates high quality and whether it can maintain this quality for ever larger project sizes. In this paper, we look at one particular quality indicator, the density of comments in open source software code. In a large-scale study of more than 5,000 projects, we find that active open source projects document their source code, and we find that the comment density is independent of team and project size, but not of project age. In future work, we intend to correlate comment density with project success or failure.

Reference: In Companion to Proceedings of the 31st International Conference on Software Engineering (ICSE 2009). IEEE Press, 2009. Pages 195-198.

Available as a PDF file.

My Position on Privacy (Seven Things About Me)

Stormy Peters recently tagged me to post seven items about my life. This is a “viral” pyramid scheme; you are supposed to write these seven items and then tag seven other people to do the same. It is not the first time I got such a request; I also got tagged on Facebook to post 25 items about my life, and in general it is quite tempting to let your personal thoughts hang out on a blog like this.

I usually ignore such requests for reasons of privacy. Everything you do or say on the Internet can be used at some future point in time. The saying “on the Internet, nobody knows you are a dog” is completely wrong; on the Internet, anyone with enough resources can not only know you are a dog but can also know everything about you down to hereditary diseases, even things you may not know yourself. Or, as Scott McNealy is famous for saying: “You have no privacy. Get over it.”

Here, then, are seven things about my take on privacy in the Internet age:

Call for Papers: Fourth Workshop on Wikis for Software Engineering

For your information, here is the call for papers for the fourth workshop on wikis for (in) software engineering. I’m on the program committee.


Fourth Workshop on “Wikis for Software Engineering”, May 16, 2009, at ICSE 2009, Vancouver, Canada, May 16-24, 2009

Submissions are due on January 26, 2009 (abstracts) and February 2, 2009 (papers).

Six Easy Pieces of Quantitatively Analyzing Open Source Projects

I’ll be giving a talk at the Open Source Business Conference 2009 in San Francisco on March 24, 2009. The talk will present an easily accessible summary of our data-driven analytical work on how open source software development works. Here is the abstract:

For the first time in the history of software engineering, we can both broadly and deeply analyze the behavior and dynamics of software development projects. This has become possible because of open source, which is publicly developed software. In this presentation, I will discuss our recent findings about open source software, its development process, and programmer behavior. I also discuss the challenges we encountered when quantitatively mining software repositories for such insights.

Reference: Talk at OSBC 2009. San Francisco, CA: 2009.

Available as a PDF file.

Organizational Design and Engineering

Most readers of this blog are probably familiar with Conway’s Law, so named by Fred Brooks in “The Mythical Man-Month” and popularized by the saying “if you have four teams working on a compiler you will get a four-pass compiler.” This sociological observation holds that the social architecture of a corporation, i.e., its organizational hierarchy, determines the technical architecture of its products. My industry experience supports this observation, and I made fun of it as early (for me) as 1996.

Now Rodrigo Magalhaes and Antonio Rito Silva of Technical University of Lisbon are expanding this and related observations into a full-blown research area called “organizational design and engineering”. You are invited to submit to and participate in

I’ll be helping as a member of the IWODE ’09 program committee and as a member of the IJODE editorial board. Please find appended the Call for Papers for IWODE ’09 as a PDF file.

WikiSym 2009 Call for Papers (Submissions)

WikiSym 2009 Call for Papers

The International Symposium on Wikis and Open Collaboration

October 25-27, 2009, in Orlando, Florida, USA

In cooperation with ACM SIGPLAN and ACM SIGWEB, co-located with ACM OOPSLA 2009, peer-reviewed and archived in the ACM Digital Library


The International Symposium on Wikis (WikiSym) is the premier conference dedicated to wikis and related open collaboration systems and processes.

Every Complex System that Works Started Out as a Simple System that Worked

The title of this blog post is my paraphrasing of a “law” from the tongue-in-cheek but nevertheless somewhat serious book “Systemantics” by John Gall. I tracked it down through Grady Booch’s original OOAD book; Ralph Johnson had pointed it out to me.

What’s so special about this quote? Well, it frames an inconvenient truth rather nicely: you can’t create a complex system from scratch. Rather, you have to evolve it step by step from a simpler system. This clearly runs counter to the intuition of many top-level IT or R&D managers. Hence the unabated stream of expensive software project failures.

Open source and wikis are great examples of this proverb because of their (typically, at least initially) volunteer nature. If they don’t work, they’ll get deserted quickly. If they get deserted, they are dead, and you don’t want that if you are working on the project. Hence, this mechanism keeps you focused on the value the project or wiki provides to its users.

Open Source Labor Economics…

…is not nearly as sexy a title for an industry talk as is “Open Source Hacker Careers” so it had to go. The result you can observe at the 2009 Open Source Meets Business conference in Nuremberg, Germany, on January 28th, 2009, when I will be giving a talk (almost) so named.

Open Source Software Developer Careers

Open source is changing how software is built and how money is made. Open source also defines a new developer career that is independent of the traditional career within companies. This talk discusses this new career and argues that it creates economic value for some while it makes life harder for others. Suggesting that such a career is worthwhile, the talk then discusses key skills that a developer should possess or train in order to be successful in open source projects.