Platonides wrote:
"Erik Zachte" wrote:
I proposed doing the largest dumps in incremental steps (say, one job per letter of the alphabet, concatenated at the end), so that a rerun after an error would be less costly, but Brion says there are no disk resources for that.
Why not? 26 files, each holding 1/26 of the db, would take up the same space as a full dump.
If you were to concatenate multiple chunks into a single stream, it would either take a lot more disk space or add a few days of run time to recompress everything.
Really, though, multiple chunks are less of a disk-space problem than one of being more or less infinitely harder to manage and work with.
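(For a concrete picture of the "single stream" option mentioned above, here's a rough PHP sketch, with made-up file names, that just appends per-letter bzip2 chunks without recompressing. bzip2 tolerates concatenated streams, but the chunk files and the combined output have to coexist on disk while this runs, which is where the extra space goes; recompressing into one clean stream is what costs the extra days.)

  <?php
  // Rough sketch only; file names are hypothetical.
  // Appending compressed chunks skips the recompression pass, but the
  // chunks and the combined file both sit on disk until cleanup.
  $out = fopen( 'pages-full.xml.bz2', 'wb' );
  foreach ( range( 'A', 'Z' ) as $letter ) {
      $in = fopen( "pages-$letter.xml.bz2", 'rb' );
      stream_copy_to_stream( $in, $out );
      fclose( $in );
  }
  fclose( $out );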
Actual improvements underway include fixing up the text dump runner to recover from database disconnections (the most common problem), made possible by the switch to PHP 5, which lets the script catch exceptions for errors instead of dying outright.
The next run of each wiki _should_ now be able to recover from disconnected or temporarily overloaded databases.
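Roughly, the pattern looks like the sketch below. The DBConnectionError class and fetchRevisionText()/reconnectDatabase() helpers here are placeholders rather than the actual dump code, but the idea is the same: catch the error, back off, reconnect, and retry instead of letting the whole run die.

  <?php
  // Illustrative only: exception and helper names are placeholders.
  function fetchWithRetry( $db, $revId, $maxAttempts = 5 ) {
      for ( $attempt = 1; $attempt <= $maxAttempts; $attempt++ ) {
          try {
              return fetchRevisionText( $db, $revId );
          } catch ( DBConnectionError $e ) {
              // Disconnected or overloaded database: back off,
              // reconnect, and try again rather than bailing out.
              sleep( 10 * $attempt );
              $db = reconnectDatabase();
          }
      }
      throw new Exception( "Giving up on revision $revId after $maxAttempts attempts" );
  }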
-- brion vibber (brion @ pobox.com)