Data at Rest - but in which format?

mathiasp Monday April 26, 2021

I seem to recall a quote along the lines of "data at rest is the most beautiful thing", but I cannot find it right now, and it is driving me crazy ;)

Still, the mindset behind that maybe-quote is important for Clojure design: data first. Model your problem adequately, create useful data structures that naturally express your domain, and a lot of your work is done.

And I like that. I think it is true. That's why I eschew object-oriented design, and why I prefer functional languages, logic languages, and databases.

But what use is that data if it's in some proprietary format? Maybe it's close to the domain you're modelling, maybe it elegantly expresses core concepts, maybe it's high performance: but who will know five years from now? Who will know once you're gone?


That's why I think every data model should at least be described in semantic web formats.
RDF as a graph data model can express almost anything. The new extension RDF* lets you annotate statements (triples), giving it at least the same expressiveness as property graphs.
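
To make the comparison with property graphs a bit more concrete, here is a minimal sketch in plain Python (no RDF library involved). The `Triple` type, the `ex:` names and both helper functions are made up for illustration; the only thing borrowed from RDF* is the idea that a whole triple may itself appear as the subject of another triple, which is how an edge property like "since 2019" can be expressed. The `<< ... >>` output imitates the quoted-triple notation and is not meant as a complete serializer.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    """A statement; subject or object may itself be a Triple (RDF*-style nesting)."""
    subject: Union[str, "Triple"]
    predicate: str
    obj: Union[str, "Triple"]


# A plain statement: the "edge" of the graph.
alice_knows_bob = Triple("ex:alice", "ex:knows", "ex:bob")

# The graph is just a set of statements. The last one annotates the edge
# itself, which is what a property graph would call an edge property.
graph = {
    alice_knows_bob,
    Triple("ex:alice", "ex:name", '"Alice"'),
    Triple(alice_knows_bob, "ex:since", '"2019"'),
}


def annotations(graph, statement):
    """Collect all predicate/object pairs that are attached to a statement."""
    return {t.predicate: t.obj for t in graph if t.subject == statement}


def to_turtle_star(triple):
    """Render a triple in a Turtle-like notation with << >> for nested triples."""
    def term(x):
        if isinstance(x, Triple):
            return f"<< {term(x.subject)} {x.predicate} {term(x.obj)} >>"
        return x
    return f"{term(triple.subject)} {triple.predicate} {term(triple.obj)} ."


print(annotations(graph, alice_knows_bob))
print(to_turtle_star(Triple(alice_knows_bob, "ex:since", '"2019"')))
```

The point of the sketch is only that the annotated statement stays ordinary data: it lives in the same set as everything else and can be queried the same way.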

For open-world domains, OWL can describe the heck out of your world, and if you're bound to closed-world thinking, SHACL gives you the means to constrain the shape of your data.
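
The open-world vs. closed-world difference is easy to gloss over, so here is a rough sketch of the closed-world side in plain Python. It is not SHACL and uses no SHACL library; the function `validate_min_count` and the `ex:` data are hypothetical, and the parameters only loosely mirror what `sh:targetClass`, `sh:path` and `sh:minCount` express in a real shape.

```python
def validate_min_count(triples, target_class, path, min_count=1):
    """Closed-world check: every instance of target_class must have at
    least min_count values for path. Returns the nodes that violate this."""
    instances = {s for s, p, o in triples if p == "rdf:type" and o == target_class}
    violations = []
    for node in sorted(instances):
        count = sum(1 for s, p, o in triples if s == node and p == path)
        if count < min_count:
            violations.append(node)
    return violations


data = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", '"Alice"'),
    ("ex:bob", "rdf:type", "ex:Person"),  # no ex:name recorded for bob
}

violations = validate_min_count(data, "ex:Person", "ex:name")
print(violations)
```

Under closed-world thinking, the missing name for `ex:bob` is a violation: what is not in the data is treated as absent. An open-world reasoner looking at the same data would not complain; it would simply assume that bob has a name we do not know yet.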

Since these are standards, work done here will keep more value than work sunk in proprietary formats.

I think any valuable data today should be expressed using these formalisms.

Trouble ahead

Obviously this is not a complete solution. Far from it. While OWL allows you to define logic-based semantics for your data models, we humans very often do not understand the words we're using.

That's why agile development is so successful: the fast feedback cycle allows all team members to ground their language in experience. That basically creates a shared semantic world view and language between the participants. (Think of the meaning of "customer" for your CEO, CIO, COO, a sales guy, the support staff, your law firm. The word "customer" alone does not tell you enough.)

So, maybe the choice between proprietary and standards-based data models does not matter that much. Maybe the only way to keep that knowledge in your business is by keeping the people in your business.

Which sounds irritatingly close to knowledge management...

It will take a long time to solve this, I assume.