While no expert, I'll try to clarify what I can.
Nathan J. Yoder wrote:
I know that LiveJournal has some sort of live backup system using
MySQL and Perl, but couldn't find any details on their presentations.
You might be able to ask one of their developers for help, on their LJ
blog. Can Wikimedia afford a snapshot server? It doesn't need to be as
fast as the others.
In the long run, whatever this system is, it will probably need to be
integrated into some sort of backup, because it would be a huge pain
if something happened at the data center and you needed to restore
from the partial quasi-backups in the current systems.
The databases are replicated.
So if the master db died, recovering would just be a matter of flipping
a switch to promote a slave to master. (In the meantime the sites would
be read-only.)
Even if the data center exploded, the databases are also replicated in
Europe (that copy may not include private data; can someone shed some
light on whether the private data is replicated in Europe, at the office,
or nowhere?).
How does the current dump method work? Are they incremental in the
sense that they build up on previous dumps, instead of re-dumping all
of the data?
Yes. I don't know what was changed by Tomasz, but I doubt he modified
that. The new dumps read the latest one and query the mysql ES servers
just for new page content.
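
To illustrate the idea (this is only a rough sketch, not the actual dump
code, which I haven't checked): reuse the text already present in the
previous dump and ask the text store only for revisions that are missing.
The names incremental_dump, previous_dump and fetch_text are made up for
the example.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def incremental_dump(
    current_rev_ids: Iterable[int],
    previous_dump: Dict[int, str],
    fetch_text: Callable[[int], str],
) -> Iterator[Tuple[int, str]]:
    """Yield (rev_id, text) for every revision in the new dump.

    previous_dump maps revision id -> text taken from the last dump.
    fetch_text is a hypothetical stand-in for a query against the text
    storage servers; it is called only for revisions not in previous_dump.
    """
    for rev_id in current_rev_ids:
        if rev_id in previous_dump:
            # Old revision: its text never changes, so reuse the old copy.
            yield rev_id, previous_dump[rev_id]
        else:
            # New revision since the last dump: fetch it from storage.
            yield rev_id, fetch_text(rev_id)
```

The point is that the expensive lookups scale with the number of new
revisions rather than with the size of the whole wiki.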
For future dumps, we might have to resort to some form of snapshot
server that is fed all updates either from memcaches or mysqls. This
allows for a live backup to be performed, so it's useful for not just
dumps.
I don't see how memcaches can be used.
A server being fed from the MySQL servers is just one of the MySQL
slaves, so that is already available now.
Is it possible to suspend any individual slaves temporarily during off
peak hours to flush the database to disk and then copy the database
files to another computer? If not, we may still be able to use a
"stale" database files copied to another computer, as long as we only
use data from it that is at least a few days old, so we know that it's
been flushed to disk (not sure how mysql flushes the data...).
Sure. But database files can't be published (they contain private data)
so it's only good for internal backup.
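
Roughly, the sequence on a slave could look like the sketch below. The
SQL statements are standard MySQL ones, but run_sql and copy_files are
hypothetical callbacks I made up; run_sql must send every statement over
the same connection, because the read lock is dropped when the session
ends. I'm assuming a file-level copy is acceptable here; for InnoDB
tables, cleanly shutting the slave down before copying is probably the
safer route.

```python
from typing import Callable


def snapshot_slave(
    run_sql: Callable[[str], None],
    copy_files: Callable[[], None],
) -> None:
    """Pause replication on a slave, copy its files, then resume.

    run_sql: hypothetical helper executing one statement on a single,
             persistent connection to the slave.
    copy_files: hypothetical helper copying the data directory elsewhere.
    """
    # Stop applying updates from the master.
    run_sql("STOP SLAVE")
    try:
        # Flush tables to disk and block writes while files are copied.
        run_sql("FLUSH TABLES WITH READ LOCK")
        try:
            copy_files()
        finally:
            # Release the lock even if the copy failed.
            run_sql("UNLOCK TABLES")
    finally:
        # Resume replication; the slave catches up from where it stopped.
        run_sql("START SLAVE")
```

While this runs the slave falls behind the master, so it should be taken
out of the read rotation for the duration.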
Of course, this may all be totally off, since I don't know a lot about
the current configuration and issues, so I'll take whatever input you
have to help work on something better.
Asking is the first step :)