[Wikimedia-l] Fire Drill Re: Wikimedia sites not easy to archive (Was Re: Knol is closing tomorrow )

emijrp emijrp at gmail.com
Fri May 18 09:41:08 UTC 2012


There is no such 10 GB limit. For example, this item is 238 GB:
http://archive.org/details/ARCHIVETEAM-YV-6360017-6399947

ArchiveTeam/WikiTeam is uploading some dumps to the Internet Archive. If you
want to join the effort, please use the mailing list
https://groups.google.com/group/wikiteam-discuss to avoid wasting resources.

2012/5/18 Mike Dupont <jamesmikedupont at googlemail.com>

> Hello People,
> I have finished uploading my first set, the osm/fosm dataset (350 GB
> unpacked), to archive.org
> http://osmopenlayers.blogspot.de/2012/05/upload-finished.html
>
> We can do something similar with Wikipedia. The bucket size of
> archive.org is 10 GB, so we need to split up the data in a way that is
> useful. I have done this by putting each object on one line; each
> file contains the full data records plus the parts that belong to the
> previous block and next block, so you are able to process the blocks
> almost stand-alone.
>
> mike
>
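
As a rough illustration of the splitting scheme Mike describes above (one
record per line, whole records per file, plus a little context from the
neighbouring blocks), here is a minimal Python sketch. The function name
split_with_overlap and the parameters max_bytes and overlap are hypothetical
and chosen only for this example; this is not his actual tooling, and the
size budget is whatever you decide, not a limit imposed by archive.org.

```python
from typing import Iterable, Iterator, List, Tuple


def split_with_overlap(
    lines: Iterable[str], max_bytes: int, overlap: int = 1
) -> Iterator[Tuple[List[str], List[str], List[str]]]:
    """Yield (before, body, after) for each block.

    body   : whole records (one per line) totalling at most max_bytes;
             a single oversized record still gets a block of its own
    before : the last `overlap` records of the previous block
    after  : the first `overlap` records of the next block
    """
    # Pass 1: group whole lines into blocks without ever splitting a record.
    # (Illustration only: this keeps everything in memory, so a real dump
    # would need a streaming variant.)
    blocks: List[List[str]] = []
    current: List[str] = []
    size = 0
    for line in lines:
        n = len(line.encode("utf-8"))
        if current and size + n > max_bytes:
            blocks.append(current)
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        blocks.append(current)

    # Pass 2: attach the neighbouring context to each block.
    for i, body in enumerate(blocks):
        has_prev = i > 0 and overlap > 0
        has_next = i + 1 < len(blocks) and overlap > 0
        before = blocks[i - 1][-overlap:] if has_prev else []
        after = blocks[i + 1][:overlap] if has_next else []
        yield before, body, after
```

Each yielded tuple could be written to its own file, with the before/after
records marked as context, so that a block can be processed almost
independently of the others.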



-- 
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT <http://code.google.com/p/avbot/> |
StatMediaWiki <http://statmediawiki.forja.rediris.es> |
WikiEvidens <http://code.google.com/p/wikievidens/> |
WikiPapers <http://wikipapers.referata.com> |
WikiTeam <http://code.google.com/p/wikiteam/>
Personal website: https://sites.google.com/site/emijrp/

