On Sun, Nov 30, 2008 at 3:35 PM, Michael Peel email@mikepeel.net wrote:
On 30 Nov 2008, at 20:11, Robert Rohde wrote:
On Sun, Nov 30, 2008 at 8:20 AM, Erik Zachte erikzachte@infodisiac.com wrote:
English -> English dump
Because I and others have been frustrated by the lack of good stats on the number of active editors on the English Wikipedia, I have compiled some stats on the editing frequency on enwiki:
No worries: in only 176 days from now the English dump will be ready and I can run wikistats scripts on it. It just started 52 days ago, so let us be patient for a while ;)
Is there any reason at all to believe that it is more likely to finish this time than all the previous attempts during the last two years?
I have virtually zero faith in a script that takes 230 days and where any error wipes out all progress.
-Robert Rohde
Hold on...what? There is no recent dump of the English Wikipedia, and there hasn't been for the last 2 years?
Please tell me I'm misunderstanding things here.
Mike
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
(cc'd to wikitech-l) I saw this the other day as well and found it odd. While enwiki dumps do take the longest, this does seem like an _incredibly_ long time for "All pages with complete page edit history (.bz2)" to finish (May 2009).
-Chad
I saw this the other day as well and found it odd. While enwiki dumps do take the longest, this does seem like an _incredibly_ long time for "All pages with complete page edit history (.bz2)" to finish (May 2009).
Do you know how many pages enwiki has and how much edit history they each have? It's a lot!
I think the dumps work by starting with the last successful dump and just adding in anything that's changed. But because there haven't been any successful dumps of the whole of enwiki in a long time, it basically has to start from scratch, which is going to take a long time (and means it probably won't succeed, i.e. we have a catch-22). If my understanding of the problem is correct, the answer is to devote a more powerful computer to this one dump so that we can get things moving again. I'm sure if we asked around, someone could lend us a really powerful computer for a few weeks to do the dump on.
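To make the catch-22 concrete, here is a minimal Python sketch of the incremental idea as I understand it. It is an illustration only, not how the real dump scripts are written: `incremental_dump`, `previous_dump` and `fetch_text` are hypothetical names, and the "database" is just a callback.

```python
from typing import Callable, Dict, Iterable, Tuple


def incremental_dump(
    previous_dump: Dict[int, str],
    current_revision_ids: Iterable[int],
    fetch_text: Callable[[int], str],
) -> Tuple[Dict[int, str], int]:
    """Build a new dump, reusing text from the previous dump where possible.

    Returns the new dump and the number of revisions that had to be
    fetched from the (slow) backing store.
    """
    new_dump: Dict[int, str] = {}
    fetched = 0
    for rev_id in current_revision_ids:
        if rev_id in previous_dump:
            # Cheap path: the text is already in the last successful dump.
            new_dump[rev_id] = previous_dump[rev_id]
        else:
            # Expensive path: the text must be read from the backing store.
            new_dump[rev_id] = fetch_text(rev_id)
            fetched += 1
    return new_dump, fetched
```

With a recent previous dump only the new revisions take the expensive path. With an empty `previous_dump`, every single revision does, which is the "start from scratch" case described above.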