Data modeling

Fremantle

· Cargo · MediaWiki · modeling · data · datasets · entity-attribute-value · spreadsheets ·

I know I've said it before, but I really do find data modeling in MediaWiki with Cargo great fun, and it happens at the speed of normal wiki editing. It's a lot nicer (in the bulk of cases) than attempting to build a bespoke application. (Of course, I should preface all of this with some caveats about what can be done, but blah blah, that can all be taken as read.)

It's better to come at things from the other end: say, a small set of data in a spreadsheet. In many cases this can be easily ported into the MediaWiki way of doing things by creating a single wiki page for each row of data, with each page containing a call to a single template. This template is where almost everything else is done.
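As a rough sketch of that row-to-page idea, here's a bit of Python that turns spreadsheet rows (exported as CSV) into one template call per page. The `Book` template, the column names, and the `rows_to_pages` helper are all made up for illustration, and it only builds the wikitext; getting the pages onto the wiki is a separate step.

```python
import csv
import io


def rows_to_pages(csv_text, template_name, title_column):
    """Return {page title: wikitext}, with one template call per spreadsheet row."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row[title_column].strip()
        # Every other non-empty column becomes a named template parameter.
        params = [
            f"|{name}={value.strip()}"
            for name, value in row.items()
            if name and name != title_column
            and isinstance(value, str) and value.strip()
        ]
        pages[title] = "\n".join(["{{" + template_name, *params, "}}"])
    return pages


# Hypothetical data: the template name and columns are assumptions.
SAMPLE = """title,author,year
Sample page one,A. Writer,2001
Sample page two,B. Writer,
"""

pages = rows_to_pages(SAMPLE, "Book", "title")
for title, wikitext in pages.items():
    print(f"== {title} ==\n{wikitext}\n")
```

Empty cells are simply left out of the template call. Values containing `|` or `=` would need escaping before this could be trusted on real data.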

One weird thing about it is that it's sort of closer to an EAV (entity-attribute-value) schema than a normal table, because each record can (but really shouldn't) have different attributes. This is a good thing from the point of view of tracking changes to the data over time, because changes to attribute names as well as their values are tracked in the page history. Of course, it also means that one has to do more scripting for some types of data modification, but for the most part that's not very hard.
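To make the comparison concrete, here's a small Python sketch that treats each page's template parameters as a plain dict, flattens them into (entity, attribute, value) triples, and reports which attributes are missing from which pages. The page names, attribute names, and both helper functions are hypothetical, and none of this touches Cargo itself; it's just the shape of the data.

```python
def to_triples(records):
    """Flatten {page: {attribute: value}} into (page, attribute, value) triples."""
    return [
        (page, attr, value)
        for page, attrs in records.items()
        for attr, value in attrs.items()
    ]


def uneven_attributes(records):
    """Map each attribute that isn't on every page to the pages lacking it."""
    all_attrs = set().union(*(attrs.keys() for attrs in records.values()))
    report = {}
    for attr in sorted(all_attrs):
        missing = sorted(p for p, attrs in records.items() if attr not in attrs)
        if missing:
            report[attr] = missing
    return report


# Hypothetical records: two pages whose template calls have drifted apart.
records = {
    "Page A": {"author": "A. Writer", "year": "2001"},
    "Page B": {"author": "B. Writer", "publisher": "Example Press"},
}

triples = to_triples(records)
report = uneven_attributes(records)
print(triples)
print(report)
```

A check like this is the sort of small script that catches the "can, but really shouldn't" cases before they pile up.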

Today the thing I'm enjoying about it is that it's perfectly easy to set up a new template and table and things for even a fairly small dataset. That means the data is given the structure it needs, rather than being munged into some more generic schema. (I guess this will probably come back to bite me one day too, because I'll have separate things that don't fit together! But oh well.)

The other weird bit about Cargo is that it almost does away with the need to have categories in MediaWiki. I'm not sure if that's a good thing or not. So far I'm loving the extra flexibility I get with 'keywords' that are usually tied to mainspace pages.