I wrote a couple days ago:
- Progress and error reporting: The old backup script was a hacky shell
script with no error detection or recovery, requiring us to manually stop replication on a database server and reconfigure the wiki cluster for the duration. If something went awry, maybe nobody noticed... the hackiness of this is a large part of why we've never just let it run automatically on a cronjob.
I want to rework this for better automation and to provide useful indications of what it's doing, where it's up to, and whether something went wrong (roughly along the lines sketched below).
STATUS: Not yet started. Hope to have it done tomorrow or Friday.
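Roughly the shape I have in mind for each wiki's run: wrap every dump step with error detection and write timestamped progress to a per-wiki log, so a failure stops that wiki's run and actually gets noticed. A simplified sketch only; the step commands and file names here are illustrative, not the real scripts or paths:

#!/usr/bin/env python
# Sketch of a reworked per-wiki backup run: timestamped progress lines,
# one log per wiki, and a hard stop on the first failed step.
import subprocess
import sys
import time

def log(logfile, message):
    # Timestamped progress line so anyone can see where a run is up to.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    logfile.write("%s %s\n" % (stamp, message))
    logfile.flush()

def run_step(wiki, name, command, logfile):
    # Run one dump step; report success or failure instead of silently
    # carrying on the way the old shell script did.
    log(logfile, "%s: starting %s" % (wiki, name))
    result = subprocess.run(command, stdout=logfile, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        log(logfile, "%s: %s FAILED (exit %d)" % (wiki, name, result.returncode))
        return False
    log(logfile, "%s: %s done" % (wiki, name))
    return True

def backup_wiki(wiki):
    # Illustrative step list; abort the rest of this wiki's steps on error.
    steps = [
        ("all-current dump", ["php", "maintenance/dumpBackup.php", "--current"]),
        ("full-history dump", ["php", "maintenance/dumpBackup.php", "--full"]),
    ]
    with open("%s-backup.log" % wiki, "a") as logfile:
        for name, command in steps:
            if not run_step(wiki, name, command, logfile):
                return False
    return True

if __name__ == "__main__":
    failures = [wiki for wiki in sys.argv[1:] if not backup_wiki(wiki)]
    if failures:
        sys.exit("Backup failed for: " + ", ".join(failures))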
A semi-experimental backup run is in progress now.
Try for instance: http://download.wikimedia.org/special/sources/
For your amusement there's a log for each wiki: http://download.wikimedia.org/special/sources/backup.log
There's also now a page dump which excludes user pages and talk pages (roughly the filtering sketched below), in addition to the full and all-current sets: http://download.wikimedia.org/special/sources/pages_public.xml.gz
And uploaded files should be included again: http://download.wikimedia.org/special/sources/upload.tar
It's entirely possible that there are still horrible problems and we'll have to run it over, of course. :)
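The filtering behind that public set amounts to a streaming pass over the dump that drops user pages and talk pages. A minimal sketch of the idea; the title-prefix matching, the short exclusion list, and the input file name are just for illustration:

# List which page titles would survive into a "public" dump by skipping
# user pages and talk pages while streaming through a full pages dump.
import gzip
import xml.etree.ElementTree as ET

EXCLUDED = ("User:", "User talk:", "Talk:")

def local(tag):
    # ElementTree reports tags as "{xml-namespace}name"; keep just the name.
    return tag.rsplit("}", 1)[-1]

def public_pages(path):
    # Stream the dump and yield the titles that would stay in the public set.
    with gzip.open(path, "rb") as f:
        for _event, elem in ET.iterparse(f):
            if local(elem.tag) == "page":
                title = ""
                for child in elem:
                    if local(child.tag) == "title":
                        title = child.text or ""
                        break
                if not title.startswith(EXCLUDED):
                    yield title
                elem.clear()  # keep memory flat on a multi-gigabyte dump

if __name__ == "__main__":
    for title in public_pages("pages_full.xml.gz"):
        print(title)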
- Clean up download.wikimedia.org further, make use of status files left
by the updated backup runner script.
STATUS: Not yet started. (Doesn't have to be up before the backup starts.)
Haven't gotten to this yet; I'll try by the end of the weekend if no one else tries their hand at it first (a rough sketch of what I have in mind is below).
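The idea is just to have the runner leave a small status file per wiki and generate the download index from those. A minimal sketch, assuming a hypothetical status.txt in each wiki's dump directory; the file name, its contents, and the paths are made up for illustration:

# Build a tidy index page from per-wiki status files left by the runner.
import glob
import os

def read_status(path):
    with open(path) as f:
        return f.read().strip()

def build_index(dump_root):
    # One line per wiki directory that has a status file in it.
    rows = []
    for status_file in sorted(glob.glob(os.path.join(dump_root, "*", "status.txt"))):
        wiki = os.path.basename(os.path.dirname(status_file))
        rows.append("<li>%s: %s</li>" % (wiki, read_status(status_file)))
    return "<html><body><h1>Dumps</h1>\n<ul>\n%s\n</ul>\n</body></html>\n" % "\n".join(rows)

if __name__ == "__main__":
    with open("index.html", "w") as out:
        out.write(build_index("/data/dumps"))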
-- brion vibber (brion @ pobox.com)