Neil Harris wrote:
I definitely wouldn't recommend a flat triples store as the only storage representation.
Based on past experience with just such a system: while it's formally semantically equivalent to higher-level descriptions, it's much harder to munge, because you have to reverse-engineer all the reification that was needed to flatten the data into triples before you can see the higher-level patterns. It's much easier to just store the higher-level description in the obvious natural way, and generate the triples representation, and any other metadata output needed, from that.
True, if you know the "obvious natural way" in advance and can design a database schema for it. I don't think we can do that. We'll need a generic abstraction for storing structured (meta)data, so it can be used for all the different kinds of data we will get.
On the other hand, I see the problems with triple stores, especially wrt reification. Triples make reification very clumsy, and it's something we will need once we want to map infoboxes. We need it because a lot of the statements given in infoboxes are qualified: they have a source, a unit of measurement, an error margin, a point in time, or some other meta-statement attached. I don't have a good solution for this right now, but I do think we should consider it.
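To make the reification problem concrete, here is a rough Python sketch, not a proposal for the actual design. The `Statement` class, the `to_triples` function, the `_:stmtN` node naming and all the values are made up for illustration. It shows a qualified statement kept in a higher-level form, and what it turns into once it is flattened into triples.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Statement:
    """Hypothetical higher-level record: one claim plus its qualifiers."""
    subject: str
    prop: str
    value: object
    qualifiers: dict = field(default_factory=dict)  # e.g. source, unit, as_of


def to_triples(stmt, ids):
    """Flatten one statement into (subject, predicate, object) triples.

    An unqualified statement maps to a single triple. A qualified one has
    to be reified: it gets its own node, and the claim and every qualifier
    hang off that node.
    """
    if not stmt.qualifiers:
        return [(stmt.subject, stmt.prop, stmt.value)]
    node = f"_:stmt{next(ids)}"  # made-up naming scheme for statement nodes
    triples = [
        (node, "subject", stmt.subject),
        (node, "property", stmt.prop),
        (node, "value", stmt.value),
    ]
    triples.extend((node, key, val) for key, val in stmt.qualifiers.items())
    return triples


ids = count(1)

# Placeholder data, not taken from any real infobox.
plain = Statement("ExampleCity", "country", "ExampleLand")
qualified = Statement(
    "ExampleCity", "population", 1000,
    qualifiers={"unit": "inhabitants", "as_of": "2000", "source": "example-ref"},
)

plain_triples = to_triples(plain, ids)
qualified_triples = to_triples(qualified, ids)

for triple in plain_triples + qualified_triples:
    print(triple)
```

The single qualified claim becomes six triples around an artificial node, and anyone reading the store has to reassemble them to recover "population is 1000 inhabitants as of 2000, according to example-ref". That reassembly is the reverse-engineering Neil describes.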
-- daniel