[Foundation-l] Wikistats is back
Robert Rohde
rarohde at gmail.com
Thu Dec 25 01:46:27 UTC 2008
On Wed, Dec 24, 2008 at 4:09 PM, Brian <Brian.Mingus at colorado.edu> wrote:
> Interesting. I realize that the dump is extremely large, but if 7zip is
> really the bottleneck then to me the solutions are straightforward:
>
> 1. Offer an uncompressed version of the dump for download. Bandwidth is
> cheap and downloads can be resumed, unlike this dump process.
> 2. The WMF offers a service whereby they mail the uncompressed dump to you on
> a hard drive. You pay for the drive and a service charge.
I would estimate a complete, uncompressed enwiki dump in the present
format at ~3 TB in size. ruwiki, which has about 5% as many revisions
as enwiki, has a 187 GB uncompressed dump.
At 3 TB, virtually any mechanism of distributing an uncompressed dump
would be very problematic.
7zip currently achieves greater than 99% size reduction.
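As a rough check on the figures above, here is a minimal sketch of the arithmetic. It assumes dump size scales linearly with revision count and uses decimal units (1 TB = 1000 GB). Both are simplifications introduced here, not something the message states, and the function names are purely illustrative.

```python
# Back-of-envelope check of the figures quoted in this message.
RUWIKI_DUMP_GB = 187        # uncompressed ruwiki dump size, from the message
REVISION_RATIO = 0.05       # ruwiki revisions / enwiki revisions ("about 5%")
ENWIKI_ESTIMATE_TB = 3.0    # the ~3 TB enwiki estimate, from the message
SIZE_REDUCTION = 0.99       # "greater than 99%", so this is a lower bound


def scale_by_revisions(small_dump_gb, revision_ratio):
    """Extrapolate dump size, ASSUMING size is proportional to revision count."""
    return small_dump_gb / revision_ratio


def compressed_upper_bound_gb(uncompressed_tb, reduction):
    """Largest compressed size (GB) consistent with at least `reduction` savings."""
    return uncompressed_tb * 1000 * (1 - reduction)


if __name__ == "__main__":
    linear_gb = scale_by_revisions(RUWIKI_DUMP_GB, REVISION_RATIO)
    print(f"Linear extrapolation from ruwiki: {linear_gb / 1000:.1f} TB")

    bound_gb = compressed_upper_bound_gb(ENWIKI_ESTIMATE_TB, SIZE_REDUCTION)
    print(f"Compressed size at 99% reduction: at most about {bound_gb:.0f} GB")
```

Under these assumptions the linear extrapolation lands in the same few-terabyte range as the ~3 TB estimate, while a reduction of more than 99% puts the compressed archive at no more than a few tens of gigabytes.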
-Robert Rohde