> Why not provide a fourth host and have those three servers act as
> slaves of it?
While that is a reasonable approach (it is exactly how we do it in
production), it is almost impossible to do for all users.
First, a blind read-write split doesn't work; it needs to be curated for
each application, and we do not have control over when reads and writes
happen, so if there is lag, most applications will break. I will now
explain why there would always be lag.
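To illustrate what "curated" means here: a minimal sketch (not WMF code; all names are hypothetical) of the routing decision each application would have to make itself, because a blind split cannot know how much staleness a given read can tolerate:

```python
# Hypothetical read/write router: writes always go to the master, and
# reads fall back to the master when replica lag exceeds what this
# particular application can tolerate. A "blind" split skips the lag
# check, which is exactly why it breaks under heavy writes.
def route(is_write: bool, replica_lag_s: float, max_lag_s: float = 5.0) -> str:
    """Return which server should handle the query."""
    if is_write:
        return "master"   # writes can only go to the master
    if replica_lag_s > max_lag_s:
        return "master"   # replica too stale: read-your-writes would break
    return "replica"

print(route(False, 0.2))    # replica
print(route(False, 120.0))  # master
print(route(True, 0.0))     # master
```

The `max_lag_s` threshold is application-specific, which is the point: no single global value works for every tool.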
Second, it creates a SPOF on replication for each individual application.
While 90% of data services users run reasonable queries and use sane data
structures, a minority do extremely fast imports, create tables in the
MyISAM format (which corrupts on every crash), or do other things that
make replication lag or break. **That means that because a single user
does heavy writes, all users, including those doing simple production
replica reads, would be affected**. Also, being able to create tables and
perform writes on the wikireplicas caused lots of complaints from other
users, as those writes often locked the wiki data, causing lag. As Bryan
says, we are open to other methods (e.g. we could pregenerate/import
summary tables from production), but those should be puppetized,
automated, and checked for sanity (InnoDB only, no blocking or breakage
of replication, etc.). Creating summary tables through replication means
people can share that data instead of it being replicated multiple
times. These are not theoretical concerns; we had to blacklist some
databases from replication on toolsdb because they were causing similar
replication issues:
https://phabricator.wikimedia.org/T127164
Another thing we could do is experiment with FEDERATEDX/CONNECT tables
(making virtual tables from the wikireplicas available from toolsdb),
something that was done in the past, but it has the same problem as
userspace joins: its performance is not great.
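For context, the standard MariaDB FEDERATED-style DDL looks roughly like this (server, credentials, and table names are purely illustrative, and this fragment needs a live remote server to work):

```sql
-- Hypothetical example: expose a wikireplica table inside toolsdb
-- as a virtual table. Each query against enwiki_page is forwarded
-- over the network to the remote server, which is where the
-- performance cost comes from.
CREATE TABLE enwiki_page (
  page_id INT UNSIGNED NOT NULL,
  page_title VARBINARY(255) NOT NULL
) ENGINE=FEDERATED
  CONNECTION='mysql://user:pass@wikireplica-host:3306/enwiki/page';
```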
It is important to note that before taking this decision, we checked
existing users of user tables on the wikireplicas and saw that a) the
number of users of local databases was much lower than the total number
of users; b) in most cases, the tables were just used for summary tables
filled with INSERT...SELECT (which, by the way, was the main cause of lag
on the replica servers), and that is trivial to migrate to SELECT -> user
space -> INSERT; c) in those cases where actual JOINs are used, which we
believe to be very few, we are open to solutions like the ones mentioned
above.
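The SELECT -> user space -> INSERT migration in b) can be sketched like this, using two in-memory SQLite databases to stand in for a wikireplica and toolsdb (table and column names are made up for the example):

```python
import sqlite3

# Stand-ins: one connection for a (read-only) wikireplica, one for the
# user's toolsdb database. In reality these are two separate servers,
# which is why a single cross-server INSERT ... SELECT is not possible.
replica = sqlite3.connect(":memory:")
toolsdb = sqlite3.connect(":memory:")

replica.execute("CREATE TABLE page (page_id INTEGER, page_title TEXT)")
replica.executemany("INSERT INTO page VALUES (?, ?)",
                    [(1, "Foo"), (2, "Bar"), (3, "Baz")])

toolsdb.execute("CREATE TABLE page_summary (page_id INTEGER, page_title TEXT)")

# SELECT from the replica in bounded batches, then INSERT into toolsdb.
cur = replica.execute("SELECT page_id, page_title FROM page")
while True:
    rows = cur.fetchmany(1000)  # batch size keeps client memory bounded
    if not rows:
        break
    toolsdb.executemany("INSERT INTO page_summary VALUES (?, ?)", rows)
toolsdb.commit()

count = toolsdb.execute("SELECT COUNT(*) FROM page_summary").fetchone()[0]
print(count)  # 3
```

Because the copy runs as an ordinary client, it generates no write load on the replication stream at all.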
Moving to toolsdb will mean more available resources in most cases, and
better availability for both the replicas and the user dbs: if the
master hardware crashes, there is a passive replica to fail over to
(unlike with local tables).
On Thu, Oct 19, 2017 at 4:01 AM, Platonides <platonides(a)gmail.com> wrote:
> > It is not a happy thing for us to force anyone to change their
> > software, but as explained in the wiki page [0] we can not find a
> > reliable method to ensure that the same user created tables are
> > available on all three of the new backend servers
>
> Why not provide a fourth host and have those three servers act as
> slaves of it?
>
> Writes go to the first one, but reads and joins can go to the replicas.
>
> I'm afraid that "make the JOINs in user space" may end up on some
> cases with the app fetching all the rows of the tool table or the wiki
> replica in order to perform a JOIN that used to be straightforward.
> And the tools aren't really the place to implement all kinds of
> partial-fetch implementations and query estimates (without EXPLAIN,
> even!). Unless you know of such an efficient user space JOIN
> implementation, perhaps?
>
> Regards
>
> _______________________________________________
> Wikimedia Cloud Services mailing list
> Cloud(a)lists.wikimedia.org (formerly labs-l(a)lists.wikimedia.org)
>
> https://lists.wikimedia.org/mailman/listinfo/cloud
>
--
Jaime Crespo
<http://wikimedia.org>