> All of the errors occurred on writes to the user database.
That is strange. "enwiki.analytics.db.svc.eqiad.wmflabs" is "new": it is
served by a new set of servers, it has been upgraded recently, and it is
still being tuned. toolsdb, however, has not been touched, I think, for a
couple of weeks, since it was upgraded, and at this time it is not handled
by a proxy.
Do you use connection pooling/persistent connections? That is not allowed
[2], but more importantly, it may create connection problems if a server
fails over automatically, because a persistent connection will keep
pointing to the wrong server.
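
As an illustration of the alternative to persistent connections, here is a
minimal Python sketch that opens a fresh connection for each unit of work and
always closes it afterwards. The names `short_lived_connection`, `run_job`
and the `connect` factory are hypothetical, introduced only for this example;
in a real tool `connect` would wrap whatever database driver you already use.

```python
from contextlib import contextmanager


@contextmanager
def short_lived_connection(connect):
    """Open a fresh connection, hand it to the caller, and always close it.

    `connect` is a hypothetical zero-argument factory that returns an object
    with a close() method (for example, a thin wrapper around your driver's
    own connect call).
    """
    conn = connect()
    try:
        yield conn
    finally:
        # Closing on every exit path means nothing stays pinned to a server
        # that may have been failed over in the meantime.
        conn.close()


def run_job(connect, work):
    """Run one unit of work on its own connection and return its result."""
    with short_lived_connection(connect) as conn:
        return work(conn)
```

Because the hostname is resolved again on every new connection, a job that
starts after a failover should reach the current server rather than the old
one.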
There was no overload on toolsdb last week that could explain the extra
write load: [0]. There was one overload, however, on labsdb1009
(analytics) during the weekend, which led to me banning/throttling and
notifying several users, as they had created a denial of service: [1]
Note that one big change on the new servers (analytics and web) is that
right now there is no query limitation: if some user runs 10 long-running
queries, they can, and that could affect other users. I have not limited
that except for per-user issues. If the community wants to agree on some
limits, I can set them up with no problem, but now that we are not so
resource-bound I did not want to introduce artificial limitations, as some
people didn't like the limits that the old servers had because of their
lower resources.
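
Since there is no server-side limit, a tool can cap its own concurrency. The
following is only a sketch, assuming a multi-threaded Python tool; the cap of
2 and the names `MAX_CONCURRENT_QUERIES` and `run_limited` are hypothetical
and are not settings of the database servers.

```python
import threading

# Hypothetical self-imposed cap, chosen for illustration only.
MAX_CONCURRENT_QUERIES = 2
_slots = threading.BoundedSemaphore(MAX_CONCURRENT_QUERIES)


def run_limited(query_fn, *args, **kwargs):
    """Call query_fn while holding one slot.

    If all slots are taken, the caller blocks until another query finishes,
    so at most MAX_CONCURRENT_QUERIES calls run at the same time.
    """
    with _slots:
        return query_fn(*args, **kwargs)
```

Each worker thread would call `run_limited(my_query, ...)` instead of calling
`my_query` directly, where `my_query` stands for whatever function runs the
long query.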
[0] https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&a…
[1] https://phabricator.wikimedia.org/T182997
[2] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#Connection_hand…
> This sounds like something that is worth opening a Phabricator task
> about. We do have an existing ticket
> (<https://phabricator.wikimedia.org/T180380>) that may also be somehow
> related depending on where the disconnects are happening.
Please share details of the connection (user, code, timestamps) on a
Phabricator task; maybe there is a slowdown on toolsdb that we have not yet
noticed. That way we can take a deeper look.
--
Jaime Crespo
<http://wikimedia.org>