Tim Starling wrote:
Magnus Manske wrote:
<data>
  <query database="wikispecies" result="r1">Some sort of XQuery or SQL query for wikispecies for {{{1}}}</query>
  Some species data table using <r1>latin_name</r1>, <r1>name_en</r1>, <r1>family</r1> etc.
</data>
My approach would be to not use SQL, or anything similar. Use a custom syntax with a greatly restricted feature set. Think in terms of applications. Only allow queries which can be cached and invalidated. Fetching single rows would be a good place to start; that's all I would have implemented if I had followed my WikiDB idea.
Reducing the possible queries would simplify things. But I was under the impression that more is demanded from a WikiData system, and we should probably get this right from the start. I am not insisting on SQL or anything, though. It just seemed the natural choice to me, next to XQuery.
Cache invalidation or purging is the standard solution here. Make a list of every article which fetches a particular row, and update it on edit. Then when the row changes, invalidate all the articles in the list.
No problem here.
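To make the single-row case concrete, here is a rough sketch of the bookkeeping as I understand it. All names (RowDependencyIndex, record_fetch, row_changed) are made up for illustration and are not existing MediaWiki code; the "cache" is just a set of article titles.

```python
from collections import defaultdict


class RowDependencyIndex:
    """Hypothetical sketch: remembers which articles fetched which row."""

    def __init__(self):
        self._readers = defaultdict(set)  # row key -> titles of articles that fetched it
        self._cached = set()              # titles whose rendered page is currently cached

    def record_fetch(self, article, row_key):
        # Called while rendering an article that pulls in one row.
        self._readers[row_key].add(article)
        self._cached.add(article)

    def row_changed(self, row_key):
        # Called when a row is edited: drop every dependent article from the cache.
        stale = self._readers.pop(row_key, set())
        self._cached -= stale
        return stale

    def is_cached(self, article):
        return article in self._cached
```

An invalidated article registers itself again the next time it is rendered, so the list rebuilds itself on demand.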
Make a list of every article which contains a list of species in the Foobus family. Invalidate all articles in the list every time a species is added or removed from that family.
Now *that* requires that we
* keep the original query for each article, and its results
* rerun that query every time an (*any*) entry has changed or was added, and compare it to the original results
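Roughly, the two requirements above would look like this. It is only a sketch under my own assumptions: the table is a plain list of dicts standing in for the database, a query is a Python predicate, and the stored result is just the set of latin_name values, so a change to any other field would go unnoticed. QueryCache and on_any_change are invented names.

```python
class QueryCache:
    """Hypothetical sketch: store each article's query and its last result."""

    def __init__(self, table):
        self.table = table  # list of dict rows, a stand-in for the real database
        self.saved = {}     # article title -> (predicate, last result)

    def _run(self, predicate):
        # Assumption: the result is reduced to the set of matching latin names.
        return frozenset(row["latin_name"] for row in self.table if predicate(row))

    def register(self, article, predicate):
        self.saved[article] = (predicate, self._run(predicate))

    def on_any_change(self):
        # Rerun *every* stored query and compare with the stored result.
        stale = []
        for article, (predicate, old) in list(self.saved.items()):
            new = self._run(predicate)
            if new != old:
                stale.append(article)
                self.saved[article] = (predicate, new)
        return stale
```

Note that on_any_change does work proportional to the number of stored queries on every single edit, which is exactly the cost I am worried about.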
Also, that works only if we have, for example, all the species data in one table, with columns like kingdom, phylum, class, order, family, genus. If we instead decide to have one table for species which contains only the genus, then another table for the genus which, apart from information about the genus in general, contains the family, and so on up the hierarchy, then this will become a problem. *Theoretically*, an order could be moved from one subclass to another. Now every species in every genus in every family in every suborder of that order needs to be updated. Good luck with that.
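To illustrate the cascade with the split tables: assume, purely for the example, that every taxon only stores a pointer to its parent. Finding the species touched by moving one order then means walking the whole subtree. The function and the Foobus/Bazus data are invented.

```python
from collections import defaultdict


def affected_species(parent_of, species, moved_taxon):
    """Return every species below moved_taxon in a parent-pointer hierarchy.

    parent_of maps each taxon to its parent; species is the set of taxa
    that are species. Both are hypothetical stand-ins for the real tables.
    """
    # Invert the parent pointers so we can walk downwards.
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)

    hit = set()
    stack = [moved_taxon]
    while stack:
        taxon = stack.pop()
        if taxon in species:
            hit.add(taxon)
        stack.extend(children[taxon])
    return hit
```

Every article showing any of the returned species would have to be invalidated, even though only one row (the order) was edited.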
It's disappointing to give up on some of the dream, but at some stage of the development process, you have to be realistic. My advice would be to set a short term goal (a few months or so), code something useful, admire your work, then go from there.
If there were consensus to limit WikiData to only the simplest queries ("... WHERE name='Foobus'"), and to give up on instant updates and just clear the cache once in a while to update data in articles, something could be done. Otherwise, I'll steer clear of this one, unless it turns out there's something obvious I missed.
Magnus