FWIW, I have implemented a queryable stand-alone web server that keeps all
of the wikidata property-item links in memory. It is built from the wikidata
dumps, which appear to be published rather frequently. I'll try to deploy a
test version on wikilabs (once I figure out how all that works); it seems to
be more favourable to such services than the toolserver.
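
To make the idea concrete, here is a minimal sketch of an in-memory index for
property-item links. It is not the server described above: the class name
LinkIndex, its method names, and the (item, property, target) triple shape are
assumptions made for illustration, and parsing the dumps is left out entirely.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical in-memory index of (item, property, target) links."""

    def __init__(self):
        # (item, property) -> set of target items
        self._forward = defaultdict(set)
        # (property, target) -> set of source items
        self._reverse = defaultdict(set)

    def add(self, item, prop, target):
        """Record that `item` links to `target` via `prop`."""
        self._forward[(item, prop)].add(target)
        self._reverse[(prop, target)].add(item)

    def targets(self, item, prop):
        """All targets that `item` links to via `prop`, sorted."""
        return sorted(self._forward.get((item, prop), ()))

    def sources(self, prop, target):
        """All items that link to `target` via `prop`, sorted."""
        return sorted(self._reverse.get((prop, target), ()))


# Illustrative triples only; a real index would be filled from a dump.
index = LinkIndex()
index.add("Q42", "P31", "Q5")
index.add("Q1339", "P31", "Q5")
print(index.sources("P31", "Q5"))
```

Keeping both a forward and a reverse mapping doubles the memory use, but it
lets a web handler answer "what does this item link to" and "what links here"
with a single dictionary lookup each.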
On Fri, Apr 19, 2013 at 9:29 AM, Platonides <platonides(a)gmail.com> wrote:
On 19/04/13 01:19, DaB. wrote:
as you may know there is a rev_text_id field in the revision table. This
field points to the text table where the actual text is, or should be,
because the WMF doesn't store the text there, but only a pointer
("DB://cluster25/11458305" for example). If you query different wikis you
will see that most of them point to the same cluster or to one with a
nearby number. That tells me (and I was also told so before) that the text
of all WMF projects is stored together.
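
As a small illustration of the pointer format quoted above, the following
sketch splits such a string into its cluster name and numeric id. The names
TextPointer and parse_text_pointer are made up for this example, and only the
two-part form shown in the message ("DB://cluster25/11458305") is handled;
anything else is rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class TextPointer(NamedTuple):
    """Hypothetical container for a parsed text pointer."""
    cluster: str
    blob_id: int


# Assumed shape: DB://<cluster name>/<numeric id>, as in the example above.
_POINTER_RE = re.compile(r"^DB://([^/]+)/(\d+)$")


def parse_text_pointer(pointer: str) -> TextPointer:
    """Split a pointer string into (cluster, blob_id)."""
    match = _POINTER_RE.match(pointer)
    if match is None:
        raise ValueError(f"unrecognised text pointer: {pointer!r}")
    return TextPointer(match.group(1), int(match.group(2)))


print(parse_text_pointer("DB://cluster25/11458305"))
```

Note that the parsed result carries only a cluster and an id. Nothing in the
pointer names the wiki the text belongs to.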
The task would now be to separate wikidata from the rest, but the storage
area has no clue which project a text comes from, which makes the
separation very hard. And there is another problem: deleted texts are also
in this area, so even more filtering would be needed.
I very much doubt that this situation will change at the TS, and I also
doubt that it will be different for WikiLabs. So I guess your best bet here
is the API.
Sincerely,
DaB.
I think the only hope would be if wikidata were stored under its own
cluster (for easier differentiation) and at least one server of that
group (the master?) held only that data (so toolserver could get its
binlogs).
_______________________________________________
Toolserver-l mailing list (Toolserver-l(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list:
https://wiki.toolserver.org/view/Mailing_list_etiquette