[Labs-l] Database replica outage

Marc A. Pelletier marc at uberbox.org
Thu Sep 19 01:29:03 UTC 2013


On 09/18/2013 05:49 PM, Platonides wrote:
> "We don't need multiple replication". It may not be cheap, practical 
> or comfortable. But the facts only allowed the sentence to stand a 
> couple of months :P

All told, it took us just a hair under four hours to get replication
back up from scratch (which is pretty much a worst-case scenario). That
seems like a reasonable compromise to me.

Plus, we even identified a few points during the process where we could
shave some time off, so if it has to happen again, we'd recover even
faster.

That said, I'm going to recommend adding a fourth database to serve as a
warm swap host, so that service would be easier to restore even if the
hardware failure itself was difficult to recover from. It's not as cool
as having full redundancy, but it's the best bang for limited bucks we
can get.

-- Marc



