Hi, I am thinking about how to collect articles that were deleted under the "not notable" criterion. Is there any way we can extract them from the MySQL binlogs? And how do these mirrors work? I would be interested in setting up a mirror of deleted data, at least the part that, based on its tags, is not spam/vandalism.
mike
On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn ariel@wikimedia.org wrote:
We now have three mirror sites, yay! The full list is linked to from http://dumps.wikimedia.org/ and is also available at http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current...
Summarizing, we have:
C3L (Brazil) with the last 5 known good dumps,
Masaryk University (Czech Republic) with the last 5 known good dumps,
Your.org (USA) with the complete archive of dumps, and
for the latest version of uploaded media, Your.org with HTTP/FTP/rsync access.
Thanks to Carlos, Kevin and Yenya respectively at the above sites for volunteering space, time and effort to make this happen.
As people noticed earlier, a series of per-project media tarballs (excluding Commons) is being generated. As soon as the first run of these is complete, we'll announce their location and start generating them on a semi-regular basis.
As we've been getting the bugs out of the mirroring setup, it has become easier to add new locations. Know anyone interested? Please let us know; we would love to have them.
Ariel
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l