In five posts, I want to speculate about the next twenty years of open data based on the past twenty years of open source. The idea is to transfer what we learned from open source in one way or another to open data.
This is part 1 on the definition of open data.
Please note that this is an intellectual exercise: some of what I describe is already here, and in some respects open data is simply different from open source.
Definition and redefinition
Open data is data that is made available to anyone under a license that allows free access, use, modification, and sharing for any purpose (my adaptation of the OKFN’s Open Definition, which in turn reads like an adaptation of Stallman’s four freedoms or the open source definition). Lots of people, particularly in business, remain confused: they either don’t know about this definition or are trying to redefine it. I suspect redefinition attempts will fail, as they have failed in open source. They may create considerable strain along the way, though.
Omission of process
Neither the open source definition nor the open data definition includes an open collaborative process. Both are purely about the actual resource and the rights to it. As a consequence, in open source we see projects that may legally be considered open source but aren’t open in spirit. Similarly, open data projects that aren’t collaborative may well be constrained in ways that hinder their growth and evolution. This is likely to create strain and frustration in their communities, and may lead to forks.
I’m writing this blog post while sitting in a meeting of an open data working group that was started in 2016. I have attended three times in total, and this meeting, like the others I took part in, started over from scratch. So I expect that twenty years from now, we will still be explaining what open data is to people who are confused about it.
Next up: Using open data.