On Sun, Dec 17, 2017 at 9:44 AM, Russell Blau <russblau(a)imapmail.org> wrote:
> One of tools.dplbot's daily tasks has been having repeated problems
> since yesterday. A script that ran without errors and completed in about
> 10 minutes on Friday ran for over 90 minutes on Saturday, and died with
> a "MySQL server has gone away" error. There were no edits to the script
> in between Friday and Saturday, so I have to assume that something
> changed on the server side.
>
> The script reads from enwiki.analytics.db.svc.eqiad.wmflabs, and both
> reads from and writes to tools.labsdb. All of the errors occurred on
> writes to the user database. I was able to work around the errors by
> dropping the database connection and opening a new one immediately
> before writing (I have no idea why this works, since the timeout setting
> on the database for inactive connections is 8 hours, and this script was
> not even running for two hours; but it did work). However, the script
> continues to run for an order of magnitude longer than it did on Friday
> (~100 minutes vs. ~10 minutes). Is anyone else experiencing similar
> issues?

Can you determine if the increased runtime is from reading data from
the enwiki side or from writing to the toolsdb side?
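
If it helps, one rough way to separate the two is to accumulate
wall-clock time per phase. This is only a sketch: read_from_enwiki()
and write_to_toolsdb() are hypothetical stand-ins (they just sleep) for
whatever the script really does against each database, and the phase
names are made up.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named phase.
phase_seconds = defaultdict(float)


@contextmanager
def timed(phase):
    """Add the elapsed time of the enclosed block to phase_seconds[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        phase_seconds[phase] += time.perf_counter() - start


# Hypothetical placeholders: swap in the script's real database calls.
def read_from_enwiki():
    time.sleep(0.02)  # stands in for a query against the enwiki replica
    return [("Example", 1)]


def write_to_toolsdb(rows):
    time.sleep(0.01)  # stands in for an INSERT/UPDATE on the user database


def run():
    for _ in range(3):
        with timed("enwiki read"):
            rows = read_from_enwiki()
        with timed("toolsdb write"):
            write_to_toolsdb(rows)
    return dict(phase_seconds)


if __name__ == "__main__":
    for phase, seconds in run().items():
        print(f"{phase}: {seconds:.2f}s")
```

Comparing the two totals from a slow run should show which side
accounts for the extra time.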
This sounds like something that is worth opening a Phabricator task
about. We do have an existing ticket
(<https://phabricator.wikimedia.org/T180380>) that may also be somehow
related depending on where the disconnects are happening.
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855