FWIW, I have implemented a queryable stand-alone web server that keeps all of the wikidata property-item links in memory. It is built from the wikidata dumps, which appear to be published rather frequently. I'll try to deploy a test version on wikilabs (once I figure out how all that works); it seems to be more favourable to such services than the toolserver.
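
To give a rough idea of what "property-item links in memory" could look like, here is a minimal sketch. It is not the actual server code: the class name, the function names and the whitespace-separated "item property target" input format are all assumptions made for illustration, and the real dumps would need their own parser. The web front end is left out; only the lookup structure is shown.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical in-memory index of (item, property, target) links."""

    def __init__(self):
        # (item, property) -> set of target items
        self._by_subject = defaultdict(set)
        # (property, target) -> set of items carrying that link
        self._by_target = defaultdict(set)

    def add(self, item, prop, target):
        """Record one link in both lookup directions."""
        self._by_subject[(item, prop)].add(target)
        self._by_target[(prop, target)].add(item)

    def targets(self, item, prop):
        """Targets that `item` links to via `prop`, sorted as strings."""
        return sorted(self._by_subject.get((item, prop), ()))

    def items_with(self, prop, target):
        """Items linking to `target` via `prop`, sorted as strings."""
        return sorted(self._by_target.get((prop, target), ()))


def load_links(lines):
    """Build an index from lines of the assumed form 'Q42 P31 Q5'.

    Lines that do not split into exactly three fields are skipped.
    """
    index = LinkIndex()
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        index.add(*parts)
    return index


if __name__ == "__main__":
    sample = ["Q42 P31 Q5", "Q1339 P31 Q5", "Q42 P19 Q350"]
    index = load_links(sample)
    print(index.items_with("P31", "Q5"))
    print(index.targets("Q42", "P19"))
```

Keeping two dictionaries doubles the memory needed per link, but it makes both "what does this item link to" and "which items link here" a single hash lookup.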


On Fri, Apr 19, 2013 at 9:29 AM, Platonides <platonides@gmail.com> wrote:
On 19/04/13 01:19, DaB. wrote:
> as you may know there is a rev_text_id-field in the revision-table. This field
> points to the text-table where the actual text is – or should be. Because the
> WMF doesn’t store the text here, but only a pointer ("DB://cluster25/11458305"
> for example). If you query different wikis you will see that most of them point
> to the same cluster or one with a nearby number. That tells me (and I was
> also told so before) that all text of all wmf-projects is stored together.
> The task would now be to separate wikidata from the rest, but the storage-area
> has no record of which wiki a text comes from, which makes the separation very hard. And
> there is another problem: Deleted texts are also in this area, so even more
> filtering would be needed.
> I very much doubt that this situation will change at the TS and I also doubt that
> it will be different for WikiLabs. So I guess your best bet is the API here.
>
> Sincerely,
> DaB.

I think the only hope would be if wikidata was stored under its own
cluster (for easier differentiation) and at least one server of that
group (the master?) only had that (so toolserver could get its binlogs).

_______________________________________________
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette