The issue of mirroring Wikimedia content has been discussed with a
number of scholarly institutions engaged in data-rich research, and
the response was generally along the lines of "send us the specs, and
we will see what we can do".
I would be interested in giving this another go if someone could
provide me with those specs, preferably for the Wikimedia projects as
a whole as well as broken down by individual project, language,
timestamp, etc.
The WikiTeam's Commons archive would make for a good test dataset.
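In the meantime, here is a minimal sketch (mine, and assuming the
slices are items in the archive.org "wikimediacommons" collection
linked below) of how one could enumerate them and their sizes with the
`internetarchive` Python library (pip install internetarchive):

import internetarchive as ia

# List every slice (one archive.org item each) with its size, as a rough
# per-slice spec before committing disk space.
for result in ia.search_items('collection:wikimediacommons',
                              fields=['identifier', 'item_size']):
    size_gb = int(result.get('item_size') or 0) / 1e9
    print(f"{result['identifier']}\t{size_gb:.1f} GB")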
Daniel
--
http://www.naturkundemuseum-berlin.de/en/institution/mitarbeiter/mietchen-d…
https://en.wikipedia.org/wiki/User:Daniel_Mietchen/Publications
http://okfn.org
http://wikimedia.org
On Fri, Aug 1, 2014 at 4:42 PM, Federico Leva (Nemo) <nemowiki(a)gmail.com> wrote:
WikiTeam[1] has released an update of the chronological archive of all
Wikimedia Commons files, up to 2013. Now at ~34 TB total.
<https://archive.org/details/wikimediacommons>
I wrote to (I think) all the mirrors in the world, but apparently
nobody is interested in such a mass of media apart from the Internet
Archive (and mirrorservice.org, which took Kiwix).
The solution is simple: take a small bite and preserve a copy yourself.
One slice takes only one click, from your browser to your torrent
client, and typically 20-40 GB on your disk (biggest slice 1400 GB,
smallest 216 MB).
<https://en.wikipedia.org/wiki/User:Emijrp/Wikipedia_Archive#Image_tarballs>
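As a sketch only (assuming, as above, that each slice is an item in
the archive.org "wikimediacommons" collection and carries a .torrent
file, as archive.org items normally do), picking the smallest slice
and fetching its torrent with the `internetarchive` Python library
could look like this:

import internetarchive as ia

# Pick the smallest slice and fetch only its .torrent file; hand that to a
# torrent client (ia.download() can also pull the tarballs directly over HTTP).
slices = [r for r in ia.search_items('collection:wikimediacommons',
                                     fields=['identifier', 'item_size'])
          if r.get('item_size')]
smallest = min(slices, key=lambda r: int(r['item_size']))
ia.download(smallest['identifier'], glob_pattern='*.torrent', verbose=True)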
Nemo
P.S.: Please help spread the word everywhere.
[1]
https://github.com/WikiTeam/wikiteam