Tim Starling wrote:
Magnus Manske wrote:
<data>
<query database="wikispecies" result="r1">Some sort of XQuery or SQL
query for wikispecies for {{{1}}}</query>
Some species data table using <r1>latin_name</r1>, <r1>name_en</r1>,
<r1>family</r1> etc.
</data>
My approach would be to not use SQL, or anything similar. Use a custom
syntax with a greatly restricted feature set. Think in terms of
applications. Only allow queries which can be cached and invalidated.
Fetching single rows would be a good place to start, that's all I
would have implemented if I followed my WikiDB idea.
Reducing the possible queries would simplify things. But, I was under
the impression that more is demanded from a WikiData system, and we
should probably get this right from the start.
That said, I am not insisting on SQL or anything. It just seemed the
natural choice to me, alongside XQuery.
Cache invalidation or purging is the standard solution here. Make a
list of every article which fetches a particular row, and update it on
edit. Then when the row changes, invalidate all the articles in the list.
No problem here.
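A minimal sketch of that row-level bookkeeping, assuming rows are
identified by a (table, key) pair and articles by title. The class and
method names are hypothetical, not part of any existing WikiData code:

```python
from collections import defaultdict


class RowDependencyIndex:
    """Tracks which articles fetched which rows (hypothetical helper)."""

    def __init__(self):
        self._articles_by_row = defaultdict(set)
        self._rows_by_article = defaultdict(set)

    def record_render(self, article, rows):
        """Replace the article's dependency list after it is (re)rendered."""
        rows = set(rows)
        # Drop dependencies left over from the previous version of the article.
        for old_row in self._rows_by_article.pop(article, set()):
            self._articles_by_row[old_row].discard(article)
        for row in rows:
            self._articles_by_row[row].add(article)
            self._rows_by_article[article].add(row)

    def articles_to_invalidate(self, row):
        """Return the articles whose cached output depends on this row."""
        return set(self._articles_by_row.get(row, ()))


index = RowDependencyIndex()
index.record_render("Foobus barus", [("species", "Foobus barus")])
index.record_render(
    "List of Foobus",
    [("species", "Foobus barus"), ("species", "Foobus bazus")],
)
```

When a row is edited, only the articles returned by
articles_to_invalidate() need purging, so the cost is proportional to the
number of articles using that row.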
Make a list of every article which contains a list of species in the
Foobus family. Invalidate all articles in the list every time a
species is added or removed from that family.
Now *that* requires that we
* keep the original query for each article, and its results
* rerun that query every time an entry (*any* entry) is changed or
added, and compare the output to the original results
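To make the cost of those two points concrete, here is a rough sketch,
assuming a toy in-memory species table and queries reduced to a
(column, value) pair. All names and data are made up for illustration:

```python
# Toy stand-in for the species table (assumed data).
species = [
    {"name": "Foobus barus", "family": "Foobus"},
    {"name": "Quux minor", "family": "Quuxidae"},
]


def run_query(query):
    """Return the names of all rows where column == value."""
    column, value = query
    return frozenset(row["name"] for row in species if row[column] == value)


# For each article: the original query and the results it produced.
stored = {
    "List of Foobus species": (("family", "Foobus"), run_query(("family", "Foobus"))),
    "List of Quuxidae": (("family", "Quuxidae"), run_query(("family", "Quuxidae"))),
}


def stale_articles(stored):
    """Rerun every stored query and report articles whose results changed."""
    return sorted(
        article
        for article, (query, old_result) in stored.items()
        if run_query(query) != old_result
    )
```

Note that stale_articles() reruns every stored query, so each edit to
any entry costs one query per article that embeds a list.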
Also, that works only if we have, for example, all the species data in
one table. Like, kingdom, phylum, class, order, family, genus.
If we, instead, decide to have one table for species which contains only
the genus, then another table for the genus which, apart from
information about the genus in general, contains the order, etc., then
this will become a problem. *Theoretically*, an order could be moved
from one subclass to another. Now every species in every genus in every
family in every suborder of that order needs to be updated. Good luck
with that.
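A small sketch of that cascade, assuming each taxon stores only a link
to its parent. The taxa are invented and the suborder level is left out
for brevity:

```python
from collections import defaultdict

# child -> parent links, one per taxon (assumed example data).
parent = {
    "Foobus barus": "Foobus",      # species -> genus
    "Foobus bazus": "Foobus",
    "Foobus": "Foobidae",          # genus -> family
    "Foobidae": "Foobiformes",     # family -> order
    "Foobiformes": "Subclass A",   # order -> subclass
    "Quux minor": "Quux",
    "Quux": "Quuxidae",
    "Quuxidae": "Quuxiformes",
    "Quuxiformes": "Subclass A",
}


def descendants(taxon):
    """Collect every taxon below the given one by walking the tree."""
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)
    found, stack = [], [taxon]
    while stack:
        for child in children[stack.pop()]:
            found.append(child)
            stack.append(child)
    return found


def affected_by_move(taxon, new_parent):
    """Move a taxon and return everything whose lineage just changed."""
    parent[taxon] = new_parent
    return sorted(descendants(taxon))
```

A single edit to the order's row touches every taxon beneath it, and
each of those can have its own list of dependent articles to purge.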
It's disappointing to give up on some of the dream, but at some stage
of the development process, you have to be realistic. My advice would
be to set a short term goal (a few months or so), code something
useful, admire your work, then go from there.
If there were consensus to limit WikiData to only the simplest
queries ("... WHERE name='Foobus'"), and to give up on instant updates
and just clear the cache once in a while to update data in articles,
something can be done.
Otherwise, I'll steer clear of this one, unless it turns out there's
something obvious I missed.
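For illustration, the restricted variant (single-row lookups by name,
no instant updates, cache entries simply expiring after a while) could
be sketched like this. The class name, the fetch_row callback and the
60-second lifetime are assumptions for the example only:

```python
import time


class ExpiringRowCache:
    """Caches single-row lookups and refetches them after max_age_seconds."""

    def __init__(self, fetch_row, max_age_seconds, clock=time.monotonic):
        self._fetch_row = fetch_row      # e.g. runs "... WHERE name=?"
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}               # name -> (fetch time, row)

    def get(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]              # possibly stale, by design
        row = self._fetch_row(name)
        self._entries[name] = (now, row)
        return row


# Toy table and a fake clock so the behaviour can be shown without waiting.
table = {"Foobus": {"family": "Foobidae"}}
now = [0.0]
cache = ExpiringRowCache(lambda name: dict(table[name]), 60, clock=lambda: now[0])
```

Articles would show outdated data for at most the cache lifetime, which
is the trade-off described above.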
Magnus