Open source license inconsistencies on GitHub [TOSEM Journal]

Abstract: Almost all software, open or closed, builds on open source software and therefore needs to comply with the license obligations of the open source code. Not knowing which licenses to comply with poses a legal danger to anyone using open source software. This article investigates the extent of inconsistencies between licenses declared by an open source project at the top level of the repository, and the licenses found in the code. We analysed a sample of 1,000 open source GitHub repositories. We find that about half of the repositories did not fully declare all licenses found in the code. Of these, approximately ten percent represented a permissive vs. copyleft license mismatch. Furthermore, existing tools cannot fully identify licenses. We conclude that users of open source code should not only look at the declared licenses of the open source code they intend to use, but rather examine the software to understand its actual licenses.

Keywords: Open source, open source licenses, license management, license conflicts

Reference: Wolter, T., Barcomb, A., Riehle, D. & Harutyunyan, N. (2022). Open Source License Inconsistencies on GitHub. In ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 32, no. 5 (2023). Article no. 110, pp. 1-23.

The paper can be downloaded as a PDF file (local copy).
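
To make the kind of inconsistency described in the abstract concrete, here is a minimal Python sketch that compares the licenses a repository declares at its top level with the licenses detected in its code. It is an illustration only, not the method used in the paper. The function names, the use of SPDX identifiers, the two small category sets, and the rule for what counts as a permissive vs. copyleft mismatch are all assumptions made for this example.

```python
# Hypothetical illustration; not the paper's tooling or classification.
# The category sets below are small, assumed examples keyed by SPDX identifier.
PERMISSIVE = {"MIT", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0", "ISC"}
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "LGPL-2.1-only",
            "LGPL-3.0-only", "AGPL-3.0-only", "MPL-2.0"}


def category(spdx_id):
    """Map a license identifier to 'permissive', 'copyleft', or 'other'."""
    if spdx_id in PERMISSIVE:
        return "permissive"
    if spdx_id in COPYLEFT:
        return "copyleft"
    return "other"


def find_inconsistencies(declared, found):
    """Compare declared licenses with licenses detected in the code.

    Returns the licenses found in the code but not declared, and the subset
    of those whose category (permissive or copyleft) does not match any
    declared license. The second rule is an assumed definition of a
    permissive vs. copyleft mismatch.
    """
    declared, found = set(declared), set(found)
    undeclared = found - declared
    declared_categories = {category(lic) for lic in declared} - {"other"}
    category_mismatches = {
        lic for lic in undeclared
        if category(lic) != "other"
        and declared_categories
        and category(lic) not in declared_categories
    }
    return {
        "undeclared": sorted(undeclared),
        "category_mismatches": sorted(category_mismatches),
    }


if __name__ == "__main__":
    # A repository that declares MIT but also contains GPL and BSD code.
    report = find_inconsistencies(
        declared=["MIT"],
        found=["MIT", "GPL-3.0-only", "BSD-3-Clause"],
    )
    print(report)
```

In this example both GPL-3.0-only and BSD-3-Clause are reported as undeclared, but only GPL-3.0-only is flagged as a category mismatch, because the repository declares only a permissive license. In practice, the set of licenses found in the code would come from a license scanner, and the abstract notes that existing tools cannot fully identify licenses, so such a report should be treated as a starting point for manual review.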
