Re: [Wikitech-l] new external store databases deployed - followup questions

19 Nov 2011


On Fri, Nov 18, 2011 at 3:41 PM, Ben Hartshorne bhartshorne@wikimedia.org wrote:
Answering a few questions in one place.
On Fri, Nov 18, 2011 at 10:29 AM, Brion Vibber brion@wikimedia.org wrote:
Hmm... what I'd expect is that if one ES save target database is in
read-only, the system should cycle through to the next available one that
is working -- the save should then succeed transparently.
Do we not have that sort of write failover logic, or are *all* ES
clusters getting locked somehow?
The last step of the maintenance was to switch the master for article
writes from ms3 to es3.  In order to make sure no data is lost during
the transition, I marked the master read-only for the duration of the
switch.  Given that there is only one ES target database to which
writes are sent (currently es3), there is nowhere to fail over to.
(All slaves run read-only all the time.)
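
To make the exchange above concrete, here is a minimal sketch of the kind of write failover Brion describes, and of why it cannot help when there is only one write target. It is an illustration in Python, not MediaWiki's actual external store code; the Store class, ReadOnlyError, save_with_failover and the "spare" cluster name are all hypothetical.

```python
class ReadOnlyError(Exception):
    """Raised when a store refuses a write (hypothetical error type)."""


class Store:
    """Hypothetical in-memory stand-in for one ES cluster master."""

    def __init__(self, name, read_only=False):
        self.name = name
        self.read_only = read_only
        self.blobs = {}

    def save(self, key, text):
        if self.read_only:
            raise ReadOnlyError(self.name)
        self.blobs[key] = text
        # Return an address that records which cluster holds the blob.
        return f"{self.name}/{key}"


def save_with_failover(targets, key, text):
    """Try each write target in order; return the address from the first
    one that accepts the write."""
    for store in targets:
        try:
            return store.save(key, text)
        except ReadOnlyError:
            continue  # this target is locked, try the next one
    # Every configured target refused the write.
    raise ReadOnlyError("no writable ES target")


# With a second, writable target the save succeeds transparently.
locked = Store("es3", read_only=True)
spare = Store("spare")  # hypothetical second write target
address = save_with_failover([locked, spare], "rev1", "text")

# With es3 as the only target, as during the maintenance, there is
# nothing left to try and the save fails.
try:
    save_with_failover([locked], "rev2", "text")
    single_target_failed = False
except ReadOnlyError:
    single_target_failed = True
```

The loop only helps if the target list has more than one entry, which is the gap the reply below proposes to fill.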
*nod* logical enough. For the future I'd recommend planning a temporary
'holding zone' cluster that would be used only during the changeover -- it
would remain read-write while the main ones are being copied.
Then after switching writes to the new targets, the holding zone can go
read-only while it gets copied over to the new target, which should go
relatively fast.
This would be just another part of the ES system rather than a separate
cache, so should remain reasonably robust: if something goes awry with the
main copy to the new clusters, you can safely stop: the holding zone will
just sit with the old servers and can just keep running like the other ES
clusters, unlike some sort of cache which might lose data.
-- brion
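
The changeover order proposed above can be sketched roughly as follows. This is only a toy model in Python under stated assumptions: Cluster, Writer, copy_blobs, changeover and the during_copy hook are hypothetical names, clusters are plain dictionaries, and real blob addressing and replication are ignored.

```python
class Cluster:
    """Hypothetical in-memory stand-in for one ES cluster."""

    def __init__(self, name):
        self.name = name
        self.read_only = False
        self.blobs = {}

    def save(self, key, text):
        if self.read_only:
            raise RuntimeError(f"{self.name} is read-only")
        self.blobs[key] = text


class Writer:
    """Holds the single current write target."""

    def __init__(self, target):
        self.target = target

    def save(self, key, text):
        self.target.save(key, text)


def copy_blobs(src, dst):
    """Copy every blob from src into dst; src must be frozen first."""
    if not src.read_only:
        raise RuntimeError("freeze the source before copying")
    dst.blobs.update(src.blobs)


def changeover(writer, old, new, holding, during_copy=lambda: None):
    writer.target = holding   # 1. new writes land in the holding zone
    old.read_only = True      # 2. freeze the old master
    during_copy()             #    hook simulating saves made during the copy
    copy_blobs(old, new)      #    long copy; saves keep succeeding meanwhile
    writer.target = new       # 3. switch writes to the new target
    holding.read_only = True  # 4. freeze the holding zone
    copy_blobs(holding, new)  #    short copy of what accumulated


old, new, holding = Cluster("old"), Cluster("new"), Cluster("holding")
writer = Writer(old)
writer.save("a", "1")
changeover(writer, old, new, holding,
           during_copy=lambda: writer.save("b", "2"))
writer.save("c", "3")
```

If the long copy in step 2 fails, the function can simply stop there: the writer still points at the holding zone, which stays read-write and keeps accepting saves like any other cluster.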
