2011/9/21 emijrp <emijrp(a)gmail.com>:
Hi all;
Just like the scripts to preserve wikis[1], I'm working on a new script to
download all Wikimedia Commons images, packed by day. But I have limited
spare time. It is sad that volunteers have to do this without any help from
the Wikimedia Foundation.
I also started an effort on Meta (with low activity so far) to mirror the
XML dumps.[2]
If you know of universities or research groups that work with
Wiki[pm]edia XML dumps, they would be promising candidates to mirror them.
If you want to download the texts to your PC, you only need 100GB of free
space and this Python script.[3]
I heard that the Internet Archive saves the XML dumps quarterly or so, but
there has been no official announcement. I also heard that the Library of
Congress wanted to mirror the dumps, but there has been no news for a long
time.
L'Encyclopédie has an "uptime"[4] of 260 years[5] and growing. Will
Wiki[pm]edia projects reach that?
Regards,
emijrp
[1] http://code.google.com/p/wikiteam/
[2] http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps
[3] http://code.google.com/p/wikiteam/source/browse/trunk/wikipediadownloader.py
Hi emijrp,
I can understand why you would prefer to have "full mirrors" of the
dumps, but let's face it, 10TB is not (yet) something that most
companies/universities can easily spare. Also, most people only work
on 1-5 versions of Wikipedia, the rest is just overhead to them.
My suggestion would be to accept mirrors of a single language and have
a smart interface at dumps.wikimedia.org that redirects requests to
the location that is the best match for the user. This system is used
by some Linux distributions (see download.opensuse.org for instance)
with great success.
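
A minimal sketch of what such a redirecting interface could look like,
assuming each mirror declares which language editions it hosts and which
country it is in. The mirror hostnames, the Mirror record, and the
pick_mirror and redirect_url helpers are all hypothetical and only
illustrate the idea; this is not how dumps.wikimedia.org or
download.opensuse.org actually work.

```python
from dataclasses import dataclass
from typing import List

# Fallback when no partial mirror hosts the requested language.
FALLBACK = "http://dumps.wikimedia.org"


@dataclass(frozen=True)
class Mirror:
    base_url: str         # hypothetical mirror location
    languages: frozenset  # language codes this mirror hosts, e.g. {"ro", "en"}
    country: str          # country code of the mirror, e.g. "RO"


# Hypothetical single-language / few-language mirrors (assumption).
MIRRORS: List[Mirror] = [
    Mirror("http://mirror-ro.example.org/dumps", frozenset({"ro", "en"}), "RO"),
    Mirror("http://mirror-us.example.edu/dumps", frozenset({"en", "es"}), "US"),
]


def pick_mirror(language: str, user_country: str, mirrors=MIRRORS) -> str:
    """Return the base URL of the best mirror for this language and user."""
    candidates = [m for m in mirrors if language in m.languages]
    if not candidates:
        return FALLBACK
    # Prefer a mirror in the user's country; the sort is stable, so the
    # declared order is kept when no mirror is in the same country.
    candidates.sort(key=lambda m: m.country != user_country)
    return candidates[0].base_url


def redirect_url(language: str, path: str, user_country: str) -> str:
    """Build the URL the central interface would redirect the request to."""
    return f"{pick_mirror(language, user_country)}/{path.lstrip('/')}"
```

With this layout a mirror only needs to carry the languages it cares
about, and the central site only needs the small table of mirrors to
answer each request with a redirect.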
Regards,
Strainu
_______________________________________________
foundation-l mailing list
foundation-l(a)lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Perhaps a torrent setup would be successful in this case.
--
Brian Mingus
Graduate student
Computational Cognitive Neuroscience Lab
University of Colorado at Boulder