On 19/04/13 01:19, DaB. wrote:
As you may know, there is a rev_text_id field in the revision table. This field points to the text table, where the actual text is, or should be: the WMF doesn't store the text there, only a pointer ("DB://cluster25/11458305", for example). If you query different wikis, you will see that most of them point to the same cluster or to one with a nearby number. That tells me (and I was also told so before) that the text of all WMF projects is stored together. The task would now be to separate Wikidata from the rest, but the storage area has no record of which wiki a text belongs to, which makes the separation very hard. And there is another problem: deleted texts are also in this area, so even more filtering would be needed. I very much doubt that this situation will change on the TS, and I also doubt that it will be different for WikiLabs. So I guess your best bet here is the API.
Sincerely, DaB.
I think the only hope would be if Wikidata were stored on its own cluster (for easier differentiation) and at least one server in that group (the master?) held only Wikidata (so the Toolserver could get its binlogs).
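
To make the pointer format from the quoted message concrete, here is a minimal Python sketch that splits such strings into cluster and blob id and tallies them per cluster. It assumes only the "DB://<cluster>/<id>" shape shown above; the names ExternalPointer, parse_pointer and count_clusters are made up for this example, and every sample address except the quoted one is invented. It is not the actual storage code.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ExternalPointer(NamedTuple):
    """Hypothetical container for the two parts of a pointer string."""
    cluster: str
    blob_id: str


def parse_pointer(address: str) -> ExternalPointer:
    """Split a string shaped like 'DB://cluster25/11458305' into its parts.

    Assumes the 'DB://<cluster>/<id>' shape quoted in the message; anything
    after the first '/' is kept verbatim as the blob id.
    """
    scheme, sep, rest = address.partition("://")
    if sep != "://" or scheme != "DB":
        raise ValueError(f"not a DB:// pointer: {address!r}")
    cluster, slash, blob_id = rest.partition("/")
    if not cluster or not slash or not blob_id:
        raise ValueError(f"malformed pointer: {address!r}")
    return ExternalPointer(cluster, blob_id)


def count_clusters(addresses: Iterable[str]) -> Counter:
    """Count how many pointers refer to each cluster."""
    return Counter(parse_pointer(a).cluster for a in addresses)


if __name__ == "__main__":
    sample = [
        "DB://cluster25/11458305",  # the address quoted in the message
        "DB://cluster25/11458999",  # invented for illustration
        "DB://cluster24/10000001",  # invented for illustration
    ]
    print(count_clusters(sample))
```

Note that the pointer carries a cluster name and an id but no wiki name, which is the reason given above for why filtering out one project's text from the shared storage is hard.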