Open Source License Inconsistencies on GitHub [TOSEM Journal]

Abstract: Almost all software, open or closed, builds on open source software and therefore needs to comply with the license obligations of the open source code. Not knowing which licenses to comply with poses a legal danger to anyone using open source software. This article investigates the extent of inconsistencies between licenses declared by an open source project at the top level of the repository, and the licenses found in the code. We analysed a sample of 1,000 open source GitHub repositories. We find that about half of the repositories did not fully declare all licenses found in the code. Of these, approximately ten percent represented a permissive vs. copyleft license mismatch. Furthermore, existing tools cannot fully identify licenses. We conclude that users of open source code should not only look at the declared licenses of the open source code they intend to use, but rather examine the software to understand its actual licenses.

Keywords: Open source, open source licenses, license management, license conflicts

Reference: Wolter, T., Barcomb, A., Riehle, D. & Harutyunyan, N. (2022). Open Source License Inconsistencies on GitHub. In ACM Transactions on Software Engineering and Methodology (TOSEM), forthcoming.

The paper can be downloaded as a PDF file (local copy).
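
To make the comparison described in the abstract concrete, here is a minimal Python sketch of checking the licenses a repository declares against those found in its code. It is an illustration only and not the method used in the paper: the function names, the small category table, and the reading of "permissive vs. copyleft mismatch" (a declared license of one category alongside an undeclared license of the other) are all assumptions made for this example. License names are written as SPDX identifiers.

```python
# Illustrative category table. This is an assumption for the sketch,
# not the classification used in the paper, and it is far from complete.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only", "LGPL-3.0-only", "MPL-2.0"}


def category(license_id):
    """Return 'permissive', 'copyleft', or 'unknown' for an SPDX identifier."""
    if license_id in PERMISSIVE:
        return "permissive"
    if license_id in COPYLEFT:
        return "copyleft"
    return "unknown"


def undeclared_licenses(declared, found):
    """Licenses found in the code that the top-level declaration does not list."""
    return set(found) - set(declared)


def has_category_mismatch(declared, found):
    """Hypothetical mismatch test: a declared license of one category
    coexists with an undeclared license of the other category."""
    declared_cats = {category(lic) for lic in declared}
    undeclared_cats = {category(lic) for lic in undeclared_licenses(declared, found)}
    return (
        ("permissive" in declared_cats and "copyleft" in undeclared_cats)
        or ("copyleft" in declared_cats and "permissive" in undeclared_cats)
    )


if __name__ == "__main__":
    declared = {"MIT"}                               # e.g. from a top-level LICENSE file
    found = {"MIT", "GPL-3.0-only", "BSD-3-Clause"}  # e.g. from per-file license headers
    print(sorted(undeclared_licenses(declared, found)))
    print(has_category_mismatch(declared, found))
```

In practice the hard part is producing the `found` set, since, as the abstract notes, existing tools cannot fully identify licenses.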
