On February 23, Brion Vibber wrote about the development of a new dump process:
I've been needing to reprioritize resources for this for a while; all of us having many other things to do at the same time ...
I don't really see why this should be. Is there still a shortage of developers? What's the plan to fix that?
On February 24, the dumps started to roll again, but with only 3 parallel processes. This has since been reduced to 2 processes. The oldest dump at the bottom of http://download.wikimedia.org/backup-index.html is now from January 19, which is 7 weeks old. It was bad enough when the cycle was 3-4 weeks during November-January.
On February 24, Brion wrote about restarting the current process:
Please note that unlike the wiki sites themselves, dump activity is *not* considered time-critical -- there is no emergency requirement to get them running as soon as possible.
This is language I don't understand. If they are "not time-critical" (not at all?), that means they could wait 4 weeks or 4 years. So why are you pretending to produce dumps at all, when you could just switch them off for the coming 3 years? Things just can't be "not time-critical". Every activity that is performed needs to be completed, or it shouldn't be performed. Maybe it can wait 4 hours or 4 days, but 4 weeks is painfully slow and 4 months is almost useless.
I need a weekly dump of current pages in order to keep improving Wikipedia. If I get one every 4 weeks, that means I'm working for one week and idling for 3. I could live with that. But during July-October 2008 I didn't get any dumps, because we were waiting for new storage systems to be installed, and right now I'm not getting any either. So was the new storage not the bottleneck after all?
This will be pushed back to later if we don't see an immediate generation-speed improvement, but it's very much a desired project since it will make the full-history dump files much smaller.
Is size all that important? I need frequent dumps, not smaller dumps.