Open Source Distributions by Life-Cycle

An open source distribution is a set of open source components configured and put together to work well as one piece of software. A commercial open source distribution is a product that you pay for, and a non-commercial distribution is freely available software. Commercial distributions may be complex products, but not all complex products are distributions.

I’ve been interested in the business of open source distributions (and distributor firms) for a while, but I mostly focused on Linux. Thanks to the Twitterverse I got a boatload of other examples to look at. Here, I want to review and classify them by life-cycle stage. The following graphic displays a simple classification model.

A simple life-cycle-based classification model of open source distributions

A complex piece of software can be in one of three stages: early, growth, or mature. It can also be a complex product (but not a commercial distribution), a commercial distribution (which is a complex product), or a non-commercial distribution (which is not a product).

The following graphic displays a sampling of distributions based on the above model.

A sampling of distributions, including those that devolved into plain products

The early stage

An open source project in an early stage is often not complex enough to be turned into a distribution. If someone starts building a distribution, the project has probably already moved into the growth stage. Alternatively, if the complexity is just not large enough to warrant a commercial distribution, but there is business to be had, the project might become a regular product.

The growth stage

As the model shows, in the growth stage I observe a split of complex products into those that are commercial distributions and those that are not. Somewhat ad hoc, I’d argue that any open-source-based product that contains more than 5% closed source code is not an open source distribution.
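The classification model and the 5% rule of thumb can be written down as a small sketch. This is only an illustration under stated assumptions: the names `Stage`, `Kind`, `Offering`, and `classify` are made up for this example, the closed source share is assumed to be measurable as a fraction of the code base, and the offering is assumed to be complex software built from open source components to begin with. The model does not say how to treat free software with a large closed source share; the sketch simply files it under complex product.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    EARLY = "early"
    GROWTH = "growth"
    MATURE = "mature"


class Kind(Enum):
    COMPLEX_PRODUCT = "complex product (not a distribution)"
    COMMERCIAL_DISTRIBUTION = "commercial distribution"
    NON_COMMERCIAL_DISTRIBUTION = "non-commercial distribution"


# The ad hoc threshold from the text: more than 5% closed source
# means the offering is not an open source distribution.
CLOSED_SOURCE_THRESHOLD = 0.05


@dataclass(frozen=True)
class Offering:
    """Hypothetical record describing one piece of complex software."""
    name: str
    stage: Stage
    closed_source_share: float  # fraction of the code base, 0.0 to 1.0
    paid: bool                  # True if users pay for it


def classify(offering: Offering) -> Kind:
    """Place an offering in one of the three kinds of the model."""
    if offering.closed_source_share > CLOSED_SOURCE_THRESHOLD:
        # Too much closed source: a product, but not a distribution.
        return Kind.COMPLEX_PRODUCT
    if offering.paid:
        return Kind.COMMERCIAL_DISTRIBUTION
    return Kind.NON_COMMERCIAL_DISTRIBUTION


# Made-up offerings, not real products or measured numbers.
examples = [
    Offering("example-a", Stage.GROWTH, 0.02, paid=True),
    Offering("example-b", Stage.GROWTH, 0.30, paid=True),
    Offering("example-c", Stage.MATURE, 0.00, paid=False),
]

for o in examples:
    print(f"{o.name}: {o.stage.value} stage, {classify(o).value}")
```

The stage is carried along as a separate attribute because, in the model, stage and kind are independent dimensions: the same three kinds can occur in any stage.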

The prime current example of an ecosystem growing at a rapid pace is Kubernetes. Three examples of commercial distributions are OpenShift (Red Hat), Mesosphere Kubernetes Engine (D2iQ), and Kubermatic Kubernetes Platform (Kubermatic). OpenStack is or was in a similar situation.

Also of interest are those that started as an open source project, but due to limited component configuration complexity became a regular product. Most notable, at present, are Apache Kafka and Apache Lucene/Solr. In the case of Kafka, Confluent has acquired, pushed out, or sidelined most of the competition (e.g. Aiven, Instaclustr), and in the case of Lucene/Solr, Elastic has done the same (e.g. Lucidworks, Swiftype).

Beyond ROS, I could not think of any non-commercial open source distributions, but maybe my mind is blocked right now. Pointers are welcome in the comments section.

The mature stage

The prime example of an open source distribution in a mature stage is and remains Linux. It exists in plenty of commercial (RHEL, SLES, Univention Corporate Server) and non-commercial distributions (Debian, Fedora, OpenSUSE). Perhaps the oldest example of an (almost) open source project that spawned non-commercial open source distributions is TeX (TeX Live, MacTeX, MiKTeX).

Two other examples of open source projects with distributions are OpenJDK and Python. Both come in commercial and non-commercial variants. Oracle JDK and Zulu (Azul Systems) are commercial distributions of OpenJDK, and Amazon Corretto and AdoptOpenJDK are non-commercial ones. ActivePython (ActiveState) and Anaconda (Continuum Analytics) are commercial distributions of Python, and CPython and WinPython are non-commercial ones.

Examples of open source projects that did not have enough configuration complexity to spawn a mature ecosystem of distributions are Hadoop and Drupal. In the case of Hadoop, Cloudera won most of the wars (against Hortonworks and MapR), and in the case of Drupal, Acquia did the same. This is not to say that there can’t be plain non-commercial open source distributions, but they seem to play a lesser role in these ecosystems.

Into the cloud

In many cases, distributions are moving into the cloud to turn the revenue stream from a license sale into a (more lucrative) subscription service. It will be a challenge for vendors to keep the promise of providing a distribution rather than a regular product: to make a distribution work at scale in their data centers, vendors often have to instrument and change it, to the extent that it won’t always be easy for users to get their applications to run elsewhere. I like the topic of lock-in, and I think that treating lock-in solely as an IP matter (source code lock-in) is too easy, but this is a topic for another time.

2 Replies to “Open Source Distributions by Life-Cycle”

  1. Dear Dirk Riehle,
    I understand that in the age of “Cloud”, subscription services are becoming more important and license sales less and less so.
    I can understand that for the success of such subscription models it is crucial that distributions work at scale on the offering server and that the application cannot be trivially made to work elsewhere.
    But what is meant by “lock-in” or “IP source code lock-in” in this context?
    Do you mean that the users of an OS-based cloud service do not fork it but prefer to
    continue using/subscribing to/locking in to
    the existing OS-based cloud service,
    and that such an OS-based cloud service is also worth a certain subscription fee (paid by such users), e.g. based on a service level agreement?
    Or do you mean with “lock-in” something else?
    BR Dirk Feuchter

    1. Hi Dirk (fabulous first name!!)

      I think you are referring to my final section “Into the cloud”? Of course, vendors like the cloud more than the traditional license sale because they can charge more, presumably at a higher margin.

      There is a difference in the software that potentially leads to new forms of lock-in. In a license sale, you sell (a license to) the distribution. In a subscription, you sell access to a distribution (first part) adapted to run in the vendor’s cloud (second part). That second part may create lock-in very overtly or very subtly, and probably both. If the vendor can, they’ll add extra proprietary features that create hard lock-in. But even subtle variations in performance behavior between the cloud service and a locally operated distribution may throw applications off balance and present a risk to customers who want to migrate away from the cloud service.

      Cheers, Dirk
