Trying to wrap my head around the Open Cloud Principles put out by the revamped Open Cloud Initiative, I’m happy to note that software engineering research has something to say about the challenges these principles will face.
Every real-world specification is an underspecification.
Well, I say that, but I doubt I’m the first to have drawn this lesson from 30+ years of software engineering research. This principle leads us directly to the challenges faced by anyone trying to stay true to the intentions behind the Open Cloud Principles.
The principles ask that all data be available using open formats and accessible through open interfaces, all based on open standards. If so, a cloud computing provider can call its services an open cloud. The intention is right. The issue is open standards, though. The hope is that you could completely specify the format of and access to the data, so that both can be replicated by another cloud provider. That is not going to play out so easily.
Given that every specification is an underspecification, any open standard will be an underspecification. It will be missing relevant aspects. What is missing is unlikely to be the data layout; usually it is the semantics, the meaning of the data. Between an SAP Business Suite and an underlying Oracle database, who controls the data? It is SAP, because its code realizes the interpretation of the data, not the plain storage.
If a specification is well-intentioned, it will simply not be complete enough. If a specification is ill-intentioned, all it will specify is a format for key/value pairs, leaving the interpretation of such data to an application. Reading the principles does not make clear to me how to avoid such intentions. (It is probably neither possible nor intended. Players who deliberately play badly will eventually be recognized as such.)
I don’t know for sure, but I’m assuming that the OCI is trying to address this issue by requiring an open source implementation for handling the data. This is the last bullet item in the definition of open standard. It is debatable whether this gets you around key/value pairs; I can imagine an open source library for handling key/value pairs that stops right where it gets interesting, i.e. where the data gets interpreted. But let’s assume that the open source library provides decent abstractions, e.g. object-oriented classes, whose implementation truthfully captures the semantics of the underlying domain concept. The principle of underspecification above stipulates that subtle semantics will escape those classes and will be caught by surrounding code interpreting the data. That code is unlikely to be available as open source, as it is likely to be competitively differentiating.
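To make the key/value concern concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the record, its field names, and the two "vendors" are hypothetical and do not come from the OCI or any real product. It only shows how an openly readable format can still leave the meaning of the data in application code.

```python
import json

# Hypothetical record in a fully "open" format: plain JSON key/value pairs.
RECORD = '{"id": "42", "amount": "1050", "status": "3"}'


def parse_open_format(text):
    """The open part: anyone can read the keys and values."""
    return json.loads(text)


def interpret_vendor_a(fields):
    """Hypothetical vendor A: amount is in cents, status "3" means paid."""
    return {"total": int(fields["amount"]) / 100, "paid": fields["status"] == "3"}


def interpret_vendor_b(fields):
    """Hypothetical vendor B: amount is in whole units, status "3" means cancelled."""
    return {"total": float(fields["amount"]), "paid": fields["status"] != "3"}


fields = parse_open_format(RECORD)
view_a = interpret_vendor_a(fields)
view_b = interpret_vendor_b(fields)
```

Both vendors read the same bytes through the same open parser, yet they disagree on what the record says. The parser could be open source; the two interpretation functions are the part that tends to stay proprietary.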
The second problem is that application providers simply won’t stop with standardized data types. Have you ever tried to get two business units of some company to agree on the notion of “customer”? You won’t succeed. It is the reason why we have design patterns like Role Object. The definition of “customer” will differ between different companies and even between different business units of the same company. So you need to provide extension mechanisms and you are back to storage using key/value pairs and/or running client-specific code to properly interpret client-specific extensions.
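For readers who have not met the Role Object pattern, here is a rough Python sketch of the idea, under my own assumptions: the class names (Customer, Borrower, Investor) and their attributes are illustrative only, not taken from any standard. A small core object holds what everyone agrees on, and each business unit attaches its own role.

```python
class Customer:
    """Core object: only what every business unit agrees on."""

    def __init__(self, name):
        self.name = name
        self._roles = {}  # maps role class -> role instance

    def add_role(self, role):
        role.core = self  # the role can reach back to the shared core
        self._roles[type(role)] = role

    def has_role(self, role_type):
        return role_type in self._roles

    def get_role(self, role_type):
        return self._roles[role_type]


class Borrower:
    """Hypothetical view of a customer held by a lending unit."""

    def __init__(self, credit_limit):
        self.core = None
        self.credit_limit = credit_limit


class Investor:
    """Hypothetical view of a customer held by an investment unit."""

    def __init__(self, portfolio):
        self.core = None
        self.portfolio = portfolio


alice = Customer("Alice")
alice.add_role(Borrower(credit_limit=5000))
```

A standard could plausibly pin down the core class. The role classes are where the client-specific meaning lives, and they are exactly the kind of extension a second cloud provider would have to understand in order to take over the data.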
The principles are well-intentioned and send people down the right road, but they are not a guarantee that you can take your data from one cloud to another.
A Pragmatic Response
It is not that this isn’t a known problem. Anyone who has worked on standardization efforts has run into it. You may think that the C programming language was specified a long time ago and is rock-solid. But that’s not true: it is still evolving, as the recent ambiguities around the volatile keyword showed. However, long-running standardization efforts do show a pragmatic way forward: effective standardization is not paperwork but effective working groups, that is, experts and community debating and documenting specifications and moving forward, whacking the loopholes and bugs like moles as they keep popping up. It is a never-ending effort, but a necessary one.
I miss this notion of a working group in the list of requirements for an open standard, but I’m sure it won’t take long for such groups to appear or to get channeled there.