The Innovations of Open Source

Dirk Riehle, Friedrich-Alexander-Universität Erlangen-Nürnberg

In IEEE Computer vol. 52, no. 4, pp. 59-63.

Open source has given us many innovations. This article provides an overview of the most important innovations and illustrates the impact that open source is having on the software industry and beyond.

The main innovations of open source can be grouped into four categories: Legal innovation, process innovation, tool innovation, and business model innovation. Probably the best known innovations are open source licenses, which also define the concept.

1. Definition

Software becomes open source software if it is provided to its users under an open source license. For a software license to be an open source license, it must fulfill the ten requirements set forth by the Open Source Initiative, the protector and arbiter of what constitutes open source [1]. Most notably, the license must allow

  • Free-of-charge use of the software,
  • Access to and modification of the source code, and
  • Redistribution of the source code and a binary copy.

Before there was open source software, there was free software, as defined by Richard Stallman. Stallman defined the four freedoms of software that make software “free”, which are

“the freedom to run the program as you wish, for any purpose […], the freedom to study how the program works, and change it so it does your computing as you wish […], the freedom to redistribute copies so you can help others […], the freedom to distribute copies of your modified versions to others […]” [2].

Open source software and free software, and the people behind them, have struggled with each other at times, but for all practical purposes, the difference is irrelevant to users. What matters is the license under which a user receives a particular piece of software.

2. Legal innovation

Licenses can be structured into permissions (the rights granted to a user), obligations (what is required to receive these rights), and prohibitions (what may not be done, for example, claiming that using the software implies an endorsement by its creator).
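As a rough illustration, this three-part structure can be written down as a small data type. The sketch below is not a legal model: the class name `License`, its fields, the `allows` helper, and the example entries are all assumptions made for illustration and do not reproduce the text of any real license.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class License:
    """Hypothetical, simplified view of a license (illustration only)."""
    name: str
    # Rights granted to the user.
    permissions: frozenset = field(default_factory=frozenset)
    # What is required to receive these rights.
    obligations: frozenset = field(default_factory=frozenset)
    # What may not be done.
    prohibitions: frozenset = field(default_factory=frozenset)

    def allows(self, action: str) -> bool:
        # An action is allowed if it is granted and not explicitly prohibited.
        return action in self.permissions and action not in self.prohibitions


# Example entries are assumptions, not the wording of an actual license.
example = License(
    name="Example permissive license",
    permissions=frozenset({"use", "modify", "redistribute"}),
    obligations=frozenset({"keep copyright notice"}),
    prohibitions=frozenset({"claim endorsement by the creator"}),
)

print(example.allows("modify"))
print(example.allows("claim endorsement by the creator"))
```

Keeping the three sets separate mirrors the reading order of most licenses: first what you may do, then what you must do in return, then what you may not do.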

The two legal innovations are:

  1. The rights grant, as introduced earlier, and
  2. A particular obligation called “copyleft”.

The rights grant helped open source spread and succeed. As research has shown, it taps into the desire of humans to help each other and collaborate on interesting projects.

People sometimes ask why developers don’t just put their work into the public domain. This misses the point: By putting something into the public domain, an author typically waives their rights. Most authors don’t want that. Rather they want to be specific which rights they grant and which obligations they require.

The most famous license obligation is probably the copyleft clause. This clause was invented by Stallman and became popular through the GNU General Public License v2 in 1991. Simplifying, it states that if you pass on copyleft-licensed code, for example as part of a product that you sell, you must also pass on your own source code if it modifies the copyleft-licensed code. The specifics of this can get complicated quickly, and they will be discussed in more detail in the next set of columns. Many companies worry that if their source code gets mixed with copyleft-licensed code, they will lose their intellectual property and hence competitive advantage in the marketplace.

It is this clause that has been used by companies in the past to wrongly discredit open source software as “a virus” or “cancer” and a “communist” or “hippie undertaking”. Nobody forces anyone to use open source software.

In an amazing about-face, some of the most well-known companies that fought open source tooth and nail only fifteen years ago have come around and are now among its biggest supporters. The business model section of this article explains some of this.

3. Process innovation

The next innovation open source has brought us is engineering process innovation [3]. The Open Source Initiative has this to say about open source software development:

“Open source is a development method for software that harnesses the power of distributed peer review and transparency of process. The promise of open source is better quality, higher reliability, more flexibility, lower cost, and an end to predatory vendor lock-in.” [1].

This is the other definition of open source, not focused on licenses and intellectual property, but on collaborative development. There is no single open source software engineering process, rather each open source community defines its own.

The first to have explored at scale a truly collaborative open source process was Linus Torvalds in the development of the Linux kernel. His approach has no particular name, but is often identified with his moniker, BDFL (benevolent dictator for life), implying a hierarchical structure. A core benefit of an open collaboration process was named after Torvalds and is called “Linus’ law”: “Given enough eyeballs, all bugs are shallow.” [4] The idea is that more broadly used software matures faster, because problems are found and solved faster.

A similar but different approach may be more popular today: The collaborative peer group, as explored by the original team of the Apache web server (httpd) and codified as The Apache Way (of open source software development) [5]. The software industry owes this group of developers as much as it owes Torvalds, if not more so.

The Apache Way is a “consensus-based, community driven governance” approach to collaboration in open source projects. The Apache Software Foundation’s website explains it in detail. An important aspect is the distinction between contributors, who submit work for inclusion in an open source project, and committers, who review and integrate the work. Committers are called maintainers in a Linux context, and they also usually are developers, not just reviewers.

Using this contributor-committer interplay, nearly all open source projects practice pre-commit code review to ensure the quality of the software under development.

I like to summarize the principles of open source software development as the three principles of open collaboration [6]: In open collaboration, participation is egalitarian (nobody is a priori excluded), decision-making is meritocratic (decisions are based on the merits of arguments rather than status in a corporate hierarchy), and people are self-organizing (people choose projects, processes, and tasks rather than get assigned to them).

Similarly, open source projects practice open communication. Open communication is public (everyone can see it), written (so you don’t have to be there when word is spoken), complete (if it wasn’t written down, it wasn’t said), and archived (so that people can look up and review discussions later on) communication.

Such open collaborative processes, not dominated by any single entity, lead to community open source software: Software that is collectively owned, managed, and developed by a diverse set of stakeholders. These collaboration processes are not limited to software but rather spill over into adjacent areas. For example, these processes have brought forward many formal and de-facto standards that the software industry relies on [3].

The methods of open source software development have also taken root inside companies, where they are called inner source [7] [8].

4. Tool innovation

Most tools used in open source software development are familiar to closed source programmers as well. That said, the needs of open source processes have led to two major tool innovations that have since become an important part of corporate software development as well:

  • Software forges
  • Distributed version control

A software forge is a website that allows the creation of new projects and provides developers with all the tools needed for software development: A homepage, an issue tracker, version control, etc.

What makes software forges special is that they facilitate matchmaking between those who are looking for a useful software component and those who are offering one. Software forges are also an enterprise software product category, because even within a single company you want one place in which to find all software being developed.

Distributed version control is version control where you copy the original repository and work with your copy. Thus, you don’t need commit rights and don’t need to ask for permission to start work. Git and Mercurial are the two best-known examples of such software. Some may argue that distributed version control is not an open source innovation, because some of its roots lie in proprietary software. However, the open source community developed and refined its own solutions, which fit the way open source software is developed, and thereby popularized the concept.

Comparing distributed version control with branching misses the point. Having their own repository lets developers work in their own style, free of any centralized decisions on how to use branches.
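The copy-and-work idea behind distributed version control can be sketched with a toy model. This is not how Git or Mercurial store or exchange data: the `Repository` class, its `commit` and `clone` methods, and the owner names are hypothetical and only illustrate why a copy removes the need for commit rights on the original.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a repository: an owner plus a list of commits."""
    owner: str
    commits: list = field(default_factory=list)

    def commit(self, author: str, message: str) -> None:
        # In this toy model, only the owner of a copy may write to it.
        if author != self.owner:
            raise PermissionError(
                f"{author} has no commit rights on {self.owner}'s repository"
            )
        self.commits.append(message)

    def clone(self, new_owner: str) -> "Repository":
        # A clone carries the full history and belongs to whoever made it.
        return Repository(owner=new_owner, commits=copy.copy(self.commits))


upstream = Repository(owner="maintainer")
upstream.commit("maintainer", "Initial import")

# A contributor cannot write to the original repository...
try:
    upstream.commit("contributor", "Fix typo")
    rejected = False
except PermissionError:
    rejected = True

# ...but can clone it and commit to the copy without asking anyone.
mine = upstream.clone("contributor")
mine.commit("contributor", "Fix typo")

print(rejected)
print(upstream.commits)
print(mine.commits)
```

The original repository is unchanged after the contributor's work. Getting the change back into it is a separate step that goes through review by whoever holds commit rights.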

Distributed version control was further helped by being the main version control software underlying a new generation of software forges, most notably GitHub and GitLab. As such, companies are adopting both forges and distributed version control at a rapid pace.

5. Business model innovation

Laying the legal foundation for open collaboration between individuals and companies, defining more effective collaboration processes with higher productivity than closed-source approaches, and inventing the tools to support it was just the beginning: Open source is changing the software industry by way of how it makes new and breaks old business models. Open source itself may not be a business model, but it is a potent strategy and tool to use in a competitive environment.

5.1 For-profit models

There are different approaches to classifying business models enabled by open source, but I like to put them into five categories. Three of them are for-profit business models, and two of them are non-profit models. The for-profit business models are:

  1. Consulting and support business models. In this conventional model, a company earns its money by providing consulting and support services for existing open source software. It doesn’t sell a license, but it services the software nevertheless. The original open source service company was Cygnus Solutions, which serviced the GNU set of tools; more recent examples are Cloudera and Hortonworks, which service Hadoop.
  2. The distributor business model. In this business model, which is unique to open source, a company sells (subscriptions to) software and associated services that are partly or completely based on open source software. This model only works for complex software consisting of tens, hundreds, or sometimes thousands of possibly incompatible components that a customer wants to use.
    The most well-known examples are the Linux distributors like Red Hat and SUSE, but there are many smaller companies providing distributions of other kinds. The competitively differentiating intellectual property of a distributor is its test suites, configuration databases, compatibility matrices, etc., which it typically doesn’t open source.
  3. The single-vendor open source business model. In this model, a company goes to market by providing a sometimes reduced, sometimes complete version of its product as open source. The company never lets go of full ownership of the software. It then sets up various incentives for users to move from the free open source version to a paid-for, commercially licensed version. The most common incentives are support and update services, but it is often also a copyleft license that users would like to replace with a proprietary one.
    Done right, the company and its products benefit from the help of the community of non-paying users. The company typically does not get code contributions, but it does get lively discussion forums, bug reports, feature ideas, word-of-mouth marketing, etc.
    The most well-known example of this model was MySQL, the database company, but there are many more recent ones, for example, SugarCRM, MongoDB, and Redis Labs.

The distributor and single-vendor models are particularly important, because they enable returns on investment that are attractive to venture capitalists. Thus, they are the main conduit through which significant amounts of money have been invested into open source software.

5.2 Open source foundations

There are two more models of how the development of open source software is being funded. They are actually two variants of the same idea: The open source foundation.

An open source foundation is a non-profit organization tasked with governing one or more open source projects, representing them legally, and ensuring their future. In the past, open source foundations were set up to ensure the survival of unsupported community open source projects, but increasingly we see companies coming together to set up a foundation with the goal of developing new open source software.

The two variants of open source foundations are:

  1. Developer foundations. This type of non-profit foundation is run by software vendors (developers) who decided to join forces to ensure the survival and flourishing of the open source software they depend on.
    By ensuring broadly shared ownership of the software, the vendors ensure that nobody can monopolize this particular type of component and reap all the profits from software products that rely on it. This is the reason why Linux was supported against Microsoft Windows, Eclipse against Microsoft Visual Studio, and more recently, OpenStack against Amazon Web Services.
  2. User foundations. This type of non-profit is predominantly run by companies who are not software vendors but rely on the software managed by the foundation, either as part of their operations or directly as part of a product that is only partly software. Examples are the Kuali Foundation for software to run universities, the GENIVI foundation for automotive infotainment software, and the openKONSEQUENZ foundation for software for the (German) smart energy grid, the last of which I helped create.

Figure 1 shows how replacing a closed source component in a product with an open source component shifts profits between the different suppliers of components and generally leaves more profit for the vendor who integrates the components and sells the final product.

Figure 1: The economic logic of community open source
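The profit shift can be made concrete with a small calculation. The sketch below assumes that the product price stays constant and that the vendor pays only a share of the joint development cost of the open source component. All figures, component names, and the `integrator_profit` function are invented for illustration and are not taken from Figure 1.

```python
def integrator_profit(price: float, component_costs: dict, integration_cost: float) -> float:
    """Profit left for the vendor who integrates components and sells the product."""
    return price - sum(component_costs.values()) - integration_cost


# All figures are invented for illustration; they are not taken from Figure 1.
price = 100.0
integration_cost = 20.0

# Both components are closed source and licensed from suppliers.
closed = {
    "component A (closed, licensed)": 30.0,
    "component B (closed, licensed)": 25.0,
}

# Component A is replaced by a community open source component; the vendor
# now pays only an assumed share of its joint development cost.
mixed = {
    "component A (open source, shared cost)": 10.0,
    "component B (closed, licensed)": 25.0,
}

before = integrator_profit(price, closed, integration_cost)
after = integrator_profit(price, mixed, integration_cost)

print(before)          # 25.0
print(after)           # 45.0
print(after - before)  # 20.0
```

In this made-up case, the 20 units that previously went to the supplier of component A stay with the integrating vendor.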

Because of this economic logic, I expect to see more product vendors and services suppliers from outside the software industry get in on the game. They will fund the development of open source components they need, taking money out of the market for this type of component and moving it to places where they can more easily appropriate it. Therefore, we can expect funding for open source software development to increase by a couple of orders of magnitude in the future.

References

[1] See https://opensource.org/

[2] See https://www.gnu.org/philosophy/free-sw.en.html

[3] Ebert, C. (2007). Open source drives innovation. IEEE Software, 24(3).

[4] Raymond, E. (1999). The cathedral and the bazaar. Knowledge, Technology & Policy, 12(3), 23-49.

[5] See https://www.apache.org/foundation/how-it-works.html

[6] Riehle, D. (2015). The five stages of open source volunteering. In Li, W., Huhns, M. N., Tsai, W.-T., & Wu, W. (Eds.), Crowdsourcing (pp. 25-38). Springer-Verlag.

[7] Dinkelacker, J., Garg, P. K., Miller, R., & Nelson, D. (2002, May). Progressive open source. In Proceedings of the 24th International Conference on Software Engineering (pp. 177-184). ACM.

[8] Riehle, D., Capraro, M., Kips, D., & Horn, L. (2016). Inner source in platform-based product engineering. IEEE Transactions on Software Engineering, 42(12), 1162-1177.