[Foundation-l] [Xmldatadumps-l] Wikipedia dumps downloader

emijrp emijrp at gmail.com
Mon Jun 27 11:07:51 UTC 2011


Hi Richard,

Yes, a distributed project would probably be the best solution, but it is
not easy to develop unless you use a library like BitTorrent (or similar)
and have many peers. However, most people don't seed the files for long, so
it is sometimes better to depend on a few committed people than on a big
but ephemeral crowd.

Regards,
emijrp

2011/6/26 Richard Farmbrough <richard at farmbrough.co.uk>

> It would be useful to have an archive of archives. I have to delete my
> old data dumps as time passes, for space reasons; however, a team could,
> between them, maintain multiple copies of every data dump. This would make
> a nice distributed project.
>
> On 26/06/2011 13:53, emijrp wrote:
>
> Hi all,
>
> Can you imagine a day when Wikipedia is added to this list?[1]
>
> WikiTeam has developed a script[2] to download all the dumps of Wikipedia
> (and its sister projects) from dumps.wikimedia.org. It sorts the files into
> folders and verifies their md5 checksums. It only works on Linux (it uses
> wget). A rough sketch of the download-and-verify idea follows the quoted
> message below.
>
> You will need about 100 GB of free disk space to download all the 7z files.
>
> Save our memory.
>
> Regards,
> emijrp
>
> [1] http://en.wikipedia.org/wiki/Destruction_of_libraries
> [2]
> http://code.google.com/p/wikiteam/source/browse/trunk/wikipediadownloader.py
>
>
> _______________________________________________
> Xmldatadumps-l mailing list
> Xmldatadumps-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>
>
>
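
As referenced above: a minimal sketch, in Python, of the download-and-verify
idea behind the script. The URL, dump date, checksum, and helper names below
are placeholders of my own; the actual wikipediadownloader.py reads the full
file lists from dumps.wikimedia.org.

import hashlib
import os
import subprocess

def download(url, dest_dir):
    # Fetch one dump file into dest_dir with wget; -c resumes partial downloads.
    os.makedirs(dest_dir, exist_ok=True)
    subprocess.check_call(['wget', '-c', '-P', dest_dir, url])
    return os.path.join(dest_dir, url.rsplit('/', 1)[1])

def md5sum(path, chunk_size=1024 * 1024):
    # Hash the file in chunks so multi-GB dumps don't have to fit in memory.
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# Placeholder file from a placeholder dump run:
url = 'http://dumps.wikimedia.org/enwiki/20110620/enwiki-20110620-pages-meta-history.xml.7z'
expected = '0123456789abcdef0123456789abcdef'  # placeholder; real values come from the *-md5sums.txt published with each dump
if md5sum(download(url, 'enwiki/20110620')) != expected:
    print('md5 mismatch; delete and re-download')

Using wget -c means an interrupted transfer can resume instead of starting
over, which matters at these file sizes.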


More information about the foundation-l mailing list