Daniel Mietchen, 03/08/2014 03:57:
The issue of mirroring Wikimedia content has been discussed with a number of scholarly institutions engaged in data-rich research, and the response was generally of the "send us the specs, and we will see what we can do" kind.
I would be interested in giving this another go if someone could provide me with those specs, preferably for Wikimedia projects as a whole as well as broken down by individual project, language, timestamp, etc.
The WikiTeam's Commons archive would make for a good test dataset.
Ariel keeps https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Requir... up to date. Anything else needed?
Nemo