On 06.06.2016 at 15:40, Yuri Astrakhan wrote:
> Do you have any thoughts about the proposed data structure?
The structure looks sane and future-proof to me, but since it's all in one blob, it'll be hard to scale it to more than a few tens of thousands of lines or so. I like this model, but if you want to go beyond that (DO we want to go beyond that?!) you will need a different approach, which may be incompatible.
One thing that should be specified very rigorously from the start is the set of supported data types, along with their exact syntax and semantics. Your example has string, number, boolean, and localized. So:
* what's the length limit for string?
* what's the range and precision of number? Is it the same as for JSON?
* does boolean only accept JSON primitives, or also strings?
* what language codes are valid for localized? Is language fallback applied for display?
Not answering these questions now may lead to having data that can later no longer be properly interpreted. If you get into quantities with precision, or dates, this becomes a lot more fun. In that case, you would want to re-use the DataValues module(s) that Wikidata uses.
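To make the questions above concrete, here is a minimal Python sketch of what a strict per-type check could look like. Everything in it is an assumption for illustration: the function name validate_value, the 400-character string limit, and the short list of language codes are placeholders, not values from the proposal and not the DataValues API.

```python
import math

# Hypothetical limits, chosen only for illustration; the real values
# would have to be fixed in the specification.
MAX_STRING_LENGTH = 400
VALID_LANGUAGE_CODES = {"en", "de", "fr"}


def validate_value(type_name, value):
    """Return value unchanged if it conforms to type_name, else raise ValueError."""
    if type_name == "string":
        if not isinstance(value, str):
            raise ValueError("string must be text")
        if len(value) > MAX_STRING_LENGTH:
            raise ValueError("string exceeds the length limit")
    elif type_name == "number":
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("number must be a JSON number")
        # NaN and infinity cannot be represented in JSON.
        if isinstance(value, float) and not math.isfinite(value):
            raise ValueError("number must be finite")
    elif type_name == "boolean":
        # Only the JSON primitives true/false, not strings like "true".
        if not isinstance(value, bool):
            raise ValueError("boolean must be a JSON primitive")
    elif type_name == "localized":
        if not isinstance(value, dict) or not value:
            raise ValueError("localized must map language codes to strings")
        for lang, text in value.items():
            if lang not in VALID_LANGUAGE_CODES:
                raise ValueError("unknown language code: %r" % (lang,))
            validate_value("string", text)
    else:
        raise ValueError("unknown type: %r" % (type_name,))
    return value
```

With this variant, validate_value("boolean", "true") is rejected while validate_value("boolean", True) passes. Whether that is the right behavior is exactly the kind of decision that should be written down before any data is stored.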
You write in your proposal "Hard to define types like Wikidata ID, datetime, and URL could be stored as a string until we can reuse Wikidata's type system". Well, what's keeping you from using it now? DataValues and friends are standalone Composer modules; you can find them on GitHub.
-- daniel