Dirk Riehle's Industry and Research Publications

Why I gray-listed GitHub for open source

Most of my software development is through my professorship, where I guide my student teams in developing (mostly) open source software. Like any competent organization, we have clear rules in place for which open source can be used in our projects, how it can be used, and which can’t. Mostly, it is about license compliance. We owe this to the users of our open source projects as well as our industry partners.

As a small organization, we rely on rules rather than lengthy approval processes, component repositories, and the like. One rule is to look at the source (location) of the open source project and see whether we have it white-listed, gray-listed, or black-listed. The Apache Software Foundation website is white-listed and Stack Overflow is black-listed. GitHub is gray-listed, meaning “it depends”.
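A rule like this is simple enough to sketch in a few lines of Python. The following is only an illustration, not the tooling we actually use: the function name `classify_source`, the domain names in the table, and the choice to treat unlisted sources as gray are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical lookup table. Only the three classifications come from the
# rule described above; the exact domain names are assumptions.
SOURCE_LISTS = {
    "apache.org": "white",
    "stackoverflow.com": "black",
    "github.com": "gray",
}


def classify_source(url: str) -> str:
    """Return "white", "gray", or "black" for the location a project comes from."""
    host = (urlparse(url).hostname or "").lower()
    # Match a listed domain itself or any of its subdomains.
    for domain, status in SOURCE_LISTS.items():
        if host == domain or host.endswith("." + domain):
            return status
    # Assumption: a source that is not listed needs a case-by-case review,
    # so it is treated like a gray-listed one.
    return "gray"


print(classify_source("https://github.com/example/project"))
```

A result of "gray" does not settle anything by itself; it only signals that someone has to make the case for the project before it is used.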

GitHub is gray-listed because it is part of the (legally speaking) Wild West of open source. I require that my students make a case for using a GitHub project, including an in-depth look at its history. In general, projects under corporate governance (Google, Facebook) are fine, as their committers have been trained in good open source hygiene.

Vicky Brasseur recently pointed out a case of a GitHub-based project where the maintainer removed the original MIT license and relicensed under a new license. There is nothing wrong with relicensing going forward as long as the original licenses are observed. However, removing an original license is usually a big no-no. You should only do this if you have documented agreement from all past contributors whose code is still in the project. This is usually unmanageable.

The problem the example creates is that an unsuspecting user of the project cannot tell from the latest release what the license obligations are. Looking at current copyright trolls, their attack vector is usually not copyleft, but (missing) attribution. Should any of the original contributors to the example project get a taste of the financial spoils of shaking down a company that did not observe this contributor’s rights, the users of the project are in danger.

The specific project, after some discussion, seems to have come around. The discussion was triggered by Richard Fontana, lawyer for Red Hat, whose commenting suggested that the project has some economic significance. I feel confident predicting that there is a long tail of less well-known projects on GitHub that have been (usually not deliberately) set up so that their users violate the project's license when using it. A bright future for license compliance experts and lawyers, indeed.

GitHub (Microsoft), the organization, is aware of this problem, I’m sure. Of course they are trying to avoid being held responsible for gross negligence on the part of their users, as in the case described above. Still, I look forward to more education and tooling by GitHub to help avoid these problems in the future.

