The whole backup process is automated, but I start the script manually
because we're short on disk space and I prefer to keep an eye on it.
I'm confident Jimbo would consider adding disk space a minor investment, provided we don't use anything fancy (RAID, SCSI). (Jimbo, am I right?) I guess bandwidth costs are the main expense, several orders of magnitude larger.
I know that 18 days without a backup is a rare event, but even losing 6 days of edits, uploads and discussions would give me the shivers. Are they stored on a different disk, by the way? If a backup is a matter of minutes per database, then the benefits of a scheduled daily outage far exceed the costs; it would even save some bandwidth ;)
Backups are an inconvenience until you need them.
Erik Zachte
On Fri, Aug 01, 2003 at 04:13:10PM +0200, Erik Zachte wrote:
> The whole backup process is automated, but I start the script manually
> because we're short on disk space and I prefer to keep an eye on it.
> I'm confident Jimbo would consider adding disk space a minor investment, provided we don't use anything fancy (RAID, SCSI). (Jimbo, am I right?) I guess bandwidth costs are the main expense, several orders of magnitude larger.
> I know that 18 days without a backup is a rare event, but even losing 6 days of edits, uploads and discussions would give me the shivers. Are they stored on a different disk, by the way? If a backup is a matter of minutes per database, then the benefits of a scheduled daily outage far exceed the costs; it would even save some bandwidth ;)
> Backups are an inconvenience until you need them.
> Erik Zachte
If database dumps are cheap, and can be done "on the fly" with MySQL, it might be a reasonable plan to just throw in an extra disk and have it dump the database every six hours or so. At the very least, set up a wget job on larousse to download pliny's nightly database dumps...
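A crontab sketch of that setup; the paths, schedule, and dump URL here are assumptions for illustration, not the actual configuration on pliny or larousse:

```shell
# On pliny (hypothetical paths): dump and compress the database every six hours.
# In crontab, % must be escaped as \%.
0 */6 * * * mysqldump --all-databases | gzip > /backup/wikidb-$(date +\%Y\%m\%d\%H).sql.gz

# On larousse: fetch pliny's nightly dump (assumed URL) shortly after it is written.
30 4 * * * wget -q -P /backup/mirror http://pliny.example/dumps/wikidb-latest.sql.gz
```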
Nick Reinking wrote:
> If database dumps are cheap, and can be done "on the fly" with MySQL, it might be a reasonable plan to just throw in an extra disk and have it dump the database every six hours or so.
Running the dump slows things down a lot, partly because compressing the dump is expensive and there's no disk space to keep an uncompressed copy. Also, unless we lock the wiki, the cur and old tables will be inconsistent in the backup. (Which is, in fact, the present state of the backups for the English wiki.)
Those wikis not yet moved to InnoDB would additionally be uneditable during a backup, as writes to the tables are blocked while they're being read.
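For illustration, the two behaviours correspond to different mysqldump invocations (a sketch; the database and table names are assumptions, and flag availability depends on the MySQL version in use):

```shell
# MyISAM tables: mysqldump takes read locks, so edits block for the duration of the dump.
mysqldump --lock-tables wikidb cur old | gzip > /backup/wikidb.sql.gz

# InnoDB tables: a consistent snapshot can be taken without blocking writers.
mysqldump --single-transaction wikidb cur old | gzip > /backup/wikidb.sql.gz
```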
Of course, a replicated server would be updated in real time at much lower performance cost.
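A minimal my.cnf sketch of such a replicated setup; the server-id values and master-host style directives are illustrative assumptions (the exact configuration syntax varies with MySQL version):

```shell
# Master (pliny): enable the binary log so changes can be replayed on a slave.
[mysqld]
log-bin
server-id = 1

# Slave (backup host): pull updates from the master in near-real time.
[mysqld]
server-id = 2
master-host = pliny   # MySQL 3.x/4.x-era style; later versions use CHANGE MASTER TO
```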
-- brion vibber (brion @ pobox.com)
Erik Zachte wrote:
> I'm confident Jimbo would consider adding disk space a minor investment, provided we don't use anything fancy (RAID, SCSI). (Jimbo, am I right?) I guess bandwidth costs are the main expense, several orders of magnitude larger.
The drive in the machine now is SCSI, and that's what I will add.
Bandwidth costs are minor, as far as I'm concerned. I've never even bothered to break it out from my overall bandwidth bill. People think Wikipedia is popular, and it is in a sense, but on a bandwidth basis, it's a small part of what we do overall.
--Jimbo
wikitech-l@lists.wikimedia.org