John Grohol wrote:
The past two dumps of enwiki have failed; last successful dump was in September. Any ideas on what might be causing this?
I don't think there's any mystery. It's not restartable, and there are all sorts of things that can make the process die. A simple solution would be to split the dump by page_id, say into 10 files; then if the process died, you could restart from the start of that file rather than from the start of the wiki. If it's too big a hassle for clients to download 10 files instead of 1, you can always put them in a separate directory and serve them by anonymous FTP.
-- Tim Starling
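
To make the restart idea concrete, here is a minimal sketch of splitting by page_id ranges, in Python. The dump_pages helper, the MAX_PAGE_ID bound, and the .done marker files are assumptions for illustration, not the actual dump scripts.

    import os

    MAX_PAGE_ID = 40_000_000   # assumed upper bound on enwiki page_id
    NUM_CHUNKS = 10

    def dump_pages(start_id, end_id, out_path):
        # Stand-in for the real XML export of pages in [start_id, end_id).
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(f"<!-- pages {start_id}..{end_id - 1} would be exported here -->\n")

    def run_chunked_dump(out_dir="dumps"):
        os.makedirs(out_dir, exist_ok=True)
        chunk = MAX_PAGE_ID // NUM_CHUNKS + 1
        for i in range(NUM_CHUNKS):
            out_path = os.path.join(out_dir, f"enwiki-pages-{i:02d}.xml")
            if os.path.exists(out_path + ".done"):
                continue   # this chunk finished on an earlier run; skip it
            start = i * chunk + 1
            dump_pages(start, start + chunk, out_path)
            open(out_path + ".done", "w").close()   # mark the chunk complete

A crashed run re-executes only the chunks without a .done marker, so the worst-case loss is one file's worth of work rather than the whole wiki.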
Wasn't there talk about something snazzy being added so that if the database connection went away or a query failed whilst creating the dump, it would catch an exception, sleep a bit, and keep retrying that operation until the connection was re-established or the query succeeded?
Yes, and that does in fact work -- if you sit and watch the dump status you'll sometimes see the message that it disconnected and is waiting to retry.
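
That retry behaviour amounts to a loop along these lines. A minimal sketch in Python, where the fixed RETRY_DELAY and the use of ConnectionError are assumptions rather than the dump code's actual choices:

    import time

    RETRY_DELAY = 5   # seconds between attempts (assumed value)

    def with_retry(operation, *args):
        # Keep re-running the same operation until it succeeds, sleeping
        # between attempts, so a dropped DB connection stalls the dump
        # instead of killing it.
        while True:
            try:
                return operation(*args)
            except ConnectionError as err:
                print(f"lost connection ({err}); retrying in {RETRY_DELAY}s")
                time.sleep(RETRY_DELAY)

    # Hypothetical usage: result = with_retry(run_query, sql)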
This last dump, though, hit an unknown problem in the XML skeleton dump which didn't produce any error message in the output.
The skeleton dump is usually very reliable, as it doesn't have to sit there begging external storage servers for data. I'll have to take a peek at it...
-- brion vibber (brion @
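
The reliability gap between the two passes comes down to that per-revision text fetch. A toy contrast in Python, with all names assumed (this is not MediaWiki's actual exporter):

    def write_skeleton(revisions, out):
        # Metadata comes straight from the core database; there is no
        # per-revision text fetch that can fail.
        for rev in revisions:
            out.write(f'<revision id="{rev["id"]}" />\n')

    def write_full(revisions, fetch_text, out):
        # The full dump adds a round-trip to external storage for every
        # revision's text -- many more chances for something to die.
        for rev in revisions:
            text = fetch_text(rev["id"])
            out.write(f'<revision id="{rev["id"]}">{text}</revision>\n')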