[Foundation-l] EN Wikipedia Editing Statistics

Thomas Dalton thomas.dalton at gmail.com
Sun Nov 30 20:58:36 UTC 2008


> I saw this the other day as well and found it odd. While enwiki dumps
> do take the longest, this does seem like an _incredibly_ long time for
> "All pages with complete page edit history (.bz2)" to finish (May 2009).

Do you know how many pages enwiki has and how much edit history they
each have? It's a lot!

I think the dumps work by starting from the last successful dump and
just adding in anything that has changed. Because there hasn't been a
successful dump of the whole of enwiki in a long time, the job
basically has to start from scratch. That is going to take a long
time, which in turn means it probably won't succeed (i.e. we have a
catch-22).

If my understanding of the problem is correct, the answer is to devote
a more powerful computer to this one dump so that we can get things
moving again. I'm sure that if we asked around, someone could lend us
a really powerful computer for a few weeks to do the dump on.
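
To make the "start from the last dump" idea concrete, here is a rough
Python sketch of what I have in mind. It is only an illustration of my
understanding, not the actual dump software: build_dump,
fetch_from_storage and the revision numbers are all made up for the
example.

```python
from typing import Callable, Dict, Iterable, Tuple


def build_dump(
    current_revision_ids: Iterable[int],
    previous_dump: Dict[int, str],
    fetch_text: Callable[[int], str],
) -> Tuple[Dict[int, str], int]:
    """Return (new_dump, number_of_expensive_fetches)."""
    new_dump: Dict[int, str] = {}
    fetches = 0
    for rev_id in current_revision_ids:
        if rev_id in previous_dump:
            # Cheap path: copy the text from the last successful dump.
            new_dump[rev_id] = previous_dump[rev_id]
        else:
            # Expensive path: the revision is new since the last dump.
            new_dump[rev_id] = fetch_text(rev_id)
            fetches += 1
    return new_dump, fetches


def fetch_from_storage(rev_id: int) -> str:
    # Hypothetical stand-in for the slow storage lookup.
    return f"text of revision {rev_id}"


revisions = range(1, 11)
# Pretend the last successful dump covered revisions 1 to 8.
old_dump = {rev_id: fetch_from_storage(rev_id) for rev_id in range(1, 9)}

_, incremental_fetches = build_dump(revisions, old_dump, fetch_from_storage)
_, scratch_fetches = build_dump(revisions, {}, fetch_from_storage)
print(incremental_fetches, scratch_fetches)
```

With a recent dump to start from, only the new revisions hit the slow
path; with no usable previous dump, every revision does, which is the
situation I think enwiki is in.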


