[Mediawiki-l] Sending DB reads to master if slave replication fails?

Platonides Platonides at gmail.com
Sun Apr 10 00:27:09 UTC 2011

Jani Patokallio wrote:
> Thanks for the fast and authoritative reply!
> Brion Vibber <brion at pobox.com> wrote:
>> It sounds like you mainly want to use replication to maintain a hot standby
>> database, not for load balancing. It may actually be best here to not tell
>> MediaWiki about the slave at all: use the master only, and consider using
>> some other HA proxy or whatever to hot-swap the old slave in for the master
>> if the master stops responding.
> So the slave becomes the master and starts accepting writes?  Doesn't
> this imply that the direction of replication on the MySQL level has to
> be reversed after the former master comes back online, and now has to
> become the slave to get any changes that were made in the meantime?
> This sounds fairly painful, especially if the failure is intermittent.
>  Alternatively, if you're saying that the slave should be read-only
> even when it takes over, then this isn't really much of an improvement
> on the current state of affairs.

Yes. If you promote a slave to master, the old master then needs to be
converted into a slave. Even worse, it is possible that some commits
were written to the old master but never replicated to the slave
(that's typical when the binlog disk gets full). In that case the data
on the old master has diverged, and you now need to reimport it.
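
One way to tell whether the old master holds commits the slave never
applied is to compare binlog coordinates once the old master is back.
The following is only a rough sketch in Python, not something MediaWiki
does for you. It assumes you saved the promoted slave's last
SHOW SLAVE STATUS row (Relay_Master_Log_File, Exec_Master_Log_Pos)
before it took over, that you can read SHOW MASTER STATUS (File,
Position) from the old master, and that binlog files use a numeric
suffix such as mysql-bin.000012. The function names are made up for
illustration.

```python
import re


def binlog_coords(file_name, position):
    """Turn ('mysql-bin.000012', 4096) into a sortable (12, 4096) tuple."""
    match = re.search(r"\.(\d+)$", file_name)
    if match is None:
        raise ValueError("unexpected binlog file name: %r" % (file_name,))
    return (int(match.group(1)), int(position))


def old_master_has_unreplicated_events(master_status, slave_status):
    """Return True if the old master wrote past what the slave executed.

    master_status: dict with the 'File' and 'Position' columns of
                   SHOW MASTER STATUS, read from the old master.
    slave_status:  dict with the 'Relay_Master_Log_File' and
                   'Exec_Master_Log_Pos' columns of SHOW SLAVE STATUS,
                   saved from the slave just before promotion.
    """
    written = binlog_coords(master_status["File"], master_status["Position"])
    applied = binlog_coords(
        slave_status["Relay_Master_Log_File"],
        slave_status["Exec_Master_Log_Pos"],
    )
    return written > applied
```

If this returns True, the old master has changes the new master never
saw, so simply pointing it at the new master as a slave is not safe.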

>> Very early on we did try to fall back to master if no non-lagged slaves were
>> available, however it can be highly problematic if you're using replication
>> for the purpose of load balancing -- which is what MediaWiki's explicit
>> support for multiple database servers is designed for.
> Problem is, while our system does occasionally get spikes of load
> usually involving heavy reads of swathes of the database (which is why
> we've got the master/slave split), the replication typically fails
> randomly for some reason other than load: network glitches, out of
> disk space, etc.  So it's not that lag is high, it's that replication
> has failed entirely.
> Anyway, I gather that the best thing to do is still set max lag, since
> at least this way the non-replicating slave switches MediaWiki into
> read-only mode and the user gets a clear failure message instead of
> just wondering why their edits seem to disappear into the ether.

Yes, it seems that the solution Brion suggested is not the appropriate
one for your setup.
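
If you want to monitor this yourself, the difference between "lag is
high" and "replication has failed entirely" is visible in the slave's
SHOW SLAVE STATUS output: when either replication thread is stopped,
Seconds_Behind_Master is NULL rather than a large number. Below is a
small Python sketch of that classification. It assumes you have already
fetched the status row as a dict (None if the server returned no row);
the function name, the labels and the 30 second threshold are arbitrary
choices for the example, not MediaWiki settings.

```python
def classify_replica(status, max_lag=30):
    """Classify a slave as 'ok', 'lagged' or 'broken'.

    status:  dict of SHOW SLAVE STATUS columns, or None if the query
             returned no row (server is not configured as a slave).
    max_lag: lag threshold in seconds (illustrative default).
    """
    if status is None:
        return "broken"

    io_running = status.get("Slave_IO_Running") == "Yes"
    sql_running = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")

    # A stopped thread or a NULL lag means replication is not
    # progressing at all, which is different from merely being behind.
    if not (io_running and sql_running) or lag is None:
        return "broken"

    if int(lag) > max_lag:
        return "lagged"
    return "ok"
```

A cron job could call this and alert on "broken", so that a failed
slave gets noticed before users start wondering why the wiki is
read-only.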
