We just finished redoing our original analysis of paid vs. volunteer work in open source for Gitee, a Chinese-language code hosting platform hosted in China. We wanted to understand where China stands in open source. Previous blog posts looked at base data, e.g. the half/half split between paid and volunteer work, as well as developer behavior, e.g. that predominantly paid developers still volunteer in their spare time.
In this third and final blog post, I would like to look at projects and how commercially dominated (or not) they are. For the purposes of this analysis, a developer is a (pure) paid developer if 95% or more of their commits are made during regular working hours, and a (pure) volunteer if 95% or more of their commits are made outside of these working hours. Obviously, this is a very conservative definition. How commercial a project is then depends on its percentage of (pure) paid developers, and how non-commercial it is depends on its percentage of (pure) volunteer developers. The following figures show how many projects exist for each percentage of either pure paid or pure volunteer developers. Please observe the logarithmic y-axis.
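To make the definition concrete, here is a minimal sketch of how the 95% rule could be applied to one project. It is not the code used in the thesis; the function names, the "mixed" label for everyone in between, and the sample numbers are assumptions made up for illustration. Each developer is represented by a pair (commits during working hours, total commits).

```python
from collections import Counter


def classify(inside: int, total: int, threshold_pct: int = 95) -> str:
    """Label one developer from commit counts (hypothetical helper).

    inside: commits made during regular working hours
    total:  all commits by this developer
    """
    if total <= 0:
        raise ValueError("developer has no commits")
    outside = total - inside
    # Integer arithmetic avoids floating-point edge cases right at 95%.
    if inside * 100 >= threshold_pct * total:
        return "paid"
    if outside * 100 >= threshold_pct * total:
        return "volunteer"
    return "mixed"  # assumed label for developers who are neither


def project_profile(developers: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Percentage of pure paid and pure volunteer developers in a project."""
    if not developers:
        raise ValueError("project has no developers")
    labels = Counter(classify(i, t) for i, t in developers.values())
    n = len(developers)
    return {
        "paid_pct": 100 * labels["paid"] / n,
        "volunteer_pct": 100 * labels["volunteer"] / n,
    }


# Made-up sample project: (commits in working hours, total commits)
project = {
    "dev_a": (98, 100),  # 98% inside working hours
    "dev_b": (2, 100),   # 98% outside working hours
    "dev_c": (60, 100),  # neither pure paid nor pure volunteer
    "dev_d": (19, 20),   # exactly 95% inside working hours
}
print(project_profile(project))  # {'paid_pct': 50.0, 'volunteer_pct': 25.0}
```

Computing such a profile for every project and counting projects per percentage value would give the kind of distribution the figures show.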
We just finished redoing our original analysis of paid vs. volunteer work in open source for Gitee, a Chinese-language code hosting platform hosted in China. We wanted to understand where China stands in open source. The previous blog post explained the half/half split between paid and volunteer time in terms of total work on open source.
So far, we have only discussed commits. Now I would like to discuss committer behavior, in particular whether there are pure paid developers, who only work Mon-Fri, 9am-5pm, i.e. during regular working hours, and pure volunteers, who work only outside those hours. Compared to our data for the Western world, the Chinese data is less conclusive. The following figure bins developers into the respective categories, and the following table spells out the bins (categories) explicitly. For the figure, please note the logarithmic scale of the y-axis.
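As a rough illustration of the working-hours test behind this binning, the following sketch computes the share of one committer's commits that fall into Mon-Fri, 9am-5pm. It assumes that commit timestamps are already in the committer's local time and that 5pm itself counts as outside; the function names and sample timestamps are hypothetical, not taken from the thesis.

```python
from datetime import datetime


def is_working_hours(ts: datetime) -> bool:
    """True for Mon-Fri, 9am (inclusive) to 5pm (exclusive).

    Assumes ts is expressed in the committer's local time.
    """
    return ts.weekday() < 5 and 9 <= ts.hour < 17


def working_hours_share(commit_times: list[datetime]) -> float:
    """Fraction of a committer's commits made during working hours."""
    if not commit_times:
        raise ValueError("committer has no commits")
    inside = sum(1 for ts in commit_times if is_working_hours(ts))
    return inside / len(commit_times)


# Made-up commit timestamps for one committer
commits = [
    datetime(2017, 6, 5, 10, 30),  # Monday morning
    datetime(2017, 6, 6, 14, 0),   # Tuesday afternoon
    datetime(2017, 6, 7, 21, 15),  # Wednesday evening
    datetime(2017, 6, 10, 11, 0),  # Saturday
]
print(working_hours_share(commits))  # 0.5
```

A share of 1.0 would correspond to a pure paid developer in the sense used here, a share of 0.0 to a pure volunteer, and this sample committer would land in one of the bins in between.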
In 2014, we published a study on paid vs. volunteer work in open source, using a representative sample of open source projects from 2008 (i.e. before GitHub). In 2008, open source activity was decidedly Western, with few contributions from China. In 2017, I finally found a student to redo the analysis for China. More specifically, the student was to use what we had identified as the most popular Chinese-language code hosting platform and perform the same analysis we had done years earlier. In this sequence of blog posts, I’ll present some of his results. The full thesis can be found on my research group’s blog.
The analysis is based on data from Gitee, one of the leading Chinese-language code hosting platforms, hosted in China. A first interesting piece of data is that despite its decidedly Chinese focus, 22.4% of all committers to Gitee projects work overseas. I find this number surprisingly large. These committers may well be Chinese (at least they are capable of reading and writing Chinese), but we don’t know more than that.
Most interestingly, but perhaps not surprisingly, the weekly work pattern on Gitee is similar to the one in the Western world. The following figure displays this work rhythm. As we can see, work intensity is highest Monday to Friday during regular working hours.
The open source working group of Bitkom, a German IT association, has prepared a short survey on open source compliance in companies. My research group supports the survey. If you are interested, please take the survey (in German).
Open source is a viable business strategy for software vendors to disrupt existing markets and conquer new ones. Just why is it easy in some markets and hard in others? I argue that you need to cut the product in such a way that there is a clear separation between what a never-paying community-user wants and what a commercial customer needs. In addition, you need to tie the commercial features closely to your company’s intellectual property and capabilities to keep competitors at bay. If you can do that, you are in the right place. If you can’t, you may want to get out of there.
Consulting company PTA reports on its development of open source software for the German energy software user consortium openKONSEQUENZ, which sponsors and manages the development of open source software for the energy sector. The Netzpraxis article starts out with:
The “Betriebstagebuch” (operations logbook) module has recently become available to companies in the energy and water industries on the openKONSEQUENZ platform. Because openKONSEQUENZ is a cooperative in formation (Genossenschaft i.G.) and the Betriebstagebuch is an open source solution, grid operators and other interested companies can use it free of charge.
tl;dr: Existing foundations need a new kind of incubator to capture budding user consortia.
An open source user consortium is a consortium of companies that sponsor, steer, and possibly also develop open source software for their own use rather than as part of software products they sell. As explained previously, this phenomenon may not be widely understood yet, but the opportunity is large. The user consortia and their members stand to benefit, and so do those existing open source foundations that are able to capture this thrust: rather than letting separate consortia form, they integrate these interests into their own governance structure.