Query killer heuristics - Toolserver-l

27 Aug 2011


      ...
the linked wiki-page was updated with the information about the
query-killer more than 1 year ago.
Sincerely,
DaB.
In my view the query killer is indeed a very nice feature for balancing
the toolserver capacity shared among user tools against its (the
toolserver's) stable operation and overall performance. I just thought that,
since it previously worked in a less restrictive way, there must have been a
good reason for the stricter restrictions just introduced.
Let's take an example of the killer's behaviour, which could be considered
partly helpful and partly questionable:
...
a MySQL-query of yours was killed because you didn't mark it as SLOW_OK
and it have run for 3612 seconds which was longer than allowed.
I see this as a very helpful part, which prompts one to think about
optimizing the query, or at least about the magic word.
Next,
...
The replication lag at kill-time was 0s.
and the query itself starts with
...
DELETE l
                  FROM l,
                       a2cr
                  WHERE l_to=a2cr_to
Each table is local to a user database and is used exclusively by a single
thread (TRANSACTION ISOLATION LEVEL READ UNCOMMITTED). This query should not
leave any other queries waiting for its completion, and since replag is zero
at execution time, it does not make any toolserver resource unavailable to
other users or limit the capacity they request.
Previously the query killer interrupted queries only when replag was above
some threshold, which is a somewhat less restrictive condition, but still
does not seem quite the right one. It killed all queries that had run for
longer than allowed and (it seemed) did not attempt any smarter analysis of
which queries had the most influence on replag or were causing it to grow.
I assume that if the query killer started checking replag more regularly, it
could be in a position to mark long-running queries that do not cause a more
or less steady replag increase as more or less safe, and to wait longer
before killing them.
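To make the suggestion concrete, here is a minimal sketch in Python of what
such a rule might look like. It is not the actual query killer code: all
names (RunningQuery, replag_is_growing, should_kill) and all numbers (the
3600 s limit, the 5 s rise, the grace factor of 4) are hypothetical and only
illustrate the idea. It also assumes that queries marked SLOW_OK are exempt
from the time limit.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunningQuery:
    runtime_s: float  # how long the query has been running, in seconds
    slow_ok: bool     # True if the author marked the query as SLOW_OK


def replag_is_growing(samples: Sequence[float], min_rise_s: float = 5.0) -> bool:
    """Return True if replag rose by at least min_rise_s over the window.

    samples holds replag readings in seconds, oldest first. The threshold
    is an assumed value, not one taken from the real query killer.
    """
    if len(samples) < 2:
        return False
    return samples[-1] - samples[0] >= min_rise_s


def should_kill(
    query: RunningQuery,
    replag_samples: Sequence[float],
    max_runtime_s: float = 3600.0,  # assumed limit; the real one is not stated
    grace_factor: float = 4.0,      # assumed extra patience while replag is flat
) -> bool:
    # Assumption: SLOW_OK queries and queries within the limit are left alone.
    if query.slow_ok or query.runtime_s <= max_runtime_s:
        return False
    # Over the limit while replag is rising: kill, as a plausible culprit.
    if replag_is_growing(replag_samples):
        return True
    # Over the limit but replag is flat: wait longer, though not forever.
    return query.runtime_s > max_runtime_s * grace_factor


# The query from the example above: 3612 s of runtime, replag constantly 0 s.
print(should_kill(RunningQuery(3612, False), [0, 0, 0]))
# The same query while replag climbs from 0 s to 12 s.
print(should_kill(RunningQuery(3612, False), [0, 4, 12]))
```

Under these assumed numbers the first call leaves the query running and the
second one kills it. The point is only that the decision depends on the
replag trend over several readings rather than on the runtime alone.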
mashiah