Brion Vibber wrote:
I managed to catch one of our little database-stuck spells (where a bunch of db threads get stuck in the "statistics" state) and got vmstat running on it.
The notable things are that the number of processes in uninterruptible sleep (the 'b' column) is bigger than usual (0-1 with an occasional 2 is normal), a few processes are even swapped out ('w' column), and i/o has gone all wonky: some unusually low input ("bi") and very unusually high output ("bo").
There's nothing reported about SCSI oddities in the system logs, nor anything in the mysql log. It may be something as mundane as the flushing of journal buffers for the filesystem or the database, in which case some tweaking may be possible, or there may be darker forces at work (mwoo-ha-ha-haaaaa!). Apache seems to have died around this time as well, apparently a failed restart. Apache is set to restart on pliny every hour at 34 minutes to keep a memory leak from getting too bad. It's possible that the exiting Apache freed up a big hole in memory that other things suddenly tried to fill? I don't know.
If bo is high, it means it is reading from the disk, correct?
By adding up that column and multiplying by 5, I get a total quantity of about 1300MB. Which is a lot.
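For anyone who wants to repeat that sum, here is a rough sketch of the arithmetic in Python. It assumes vmstat was sampling every 5 seconds (hence the multiply by 5) and that each block is 1 KiB; both are assumptions on my part, and the function names are made up for illustration.

```python
def total_blocks_out(vmstat_text, interval_seconds=5):
    """Sum the 'bo' column of vmstat output, scaled by the sampling interval.

    Assumes 'bo' is a per-second rate, so each sample stands for
    interval_seconds worth of blocks.
    """
    bo_index = None
    total = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "bo" in fields:
            # Column header row: remember where 'bo' sits.
            bo_index = fields.index("bo")
            continue
        if bo_index is None or len(fields) <= bo_index:
            continue  # nothing usable before the first header, or short row
        if not fields[bo_index].isdigit():
            continue  # skip group-title rows and other non-data lines
        total += int(fields[bo_index])
    return total * interval_seconds


def blocks_to_mib(blocks, block_kib=1):
    """Convert a block count to MiB, assuming block_kib KiB per block."""
    return blocks * block_kib / 1024
```

Note that the first data line vmstat prints is an average since boot, so it may be worth dropping that row before summing.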
I had another look at the log from my monitor program which was recorded before the system crash at ~4:30, 13 Oct. I have output from SHOW PROCESSLIST every 5 seconds for the 5 minute duration of the event. The only interesting feature I've picked out of it yet is that for the entire 5 minutes, there was a query in the "removing tmp table" state. It was definitely one query, because the thread ID was constant. It's a watchlist query:
$sql = "SELECT cur_namespace,cur_title,cur_comment, cur_user,cur_user_text,cur_timestamp,cur_minor_edit,cur_is_new FROM watchlist,cur USE INDEX ($x) WHERE wl_user=$uid AND $z AND wl_title=cur_title $docutoff ORDER BY cur_timestamp DESC";
I don't know which of the two indices it was using.
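The check I did by eye, one thread ID sitting in the same state through every SHOW PROCESSLIST sample, could be sketched like this. It assumes the log has already been parsed into one list of rows per sample, each row a dict keyed by the Id and State columns; the parsing itself and the function name are hypothetical, not what my monitor program actually does.

```python
def threads_stuck_in_state(snapshots, state):
    """Return the thread IDs found in `state` in every snapshot.

    snapshots: list of samples, each a list of dicts with "Id" and "State".
    """
    stuck = None
    for rows in snapshots:
        ids = {row["Id"] for row in rows if row.get("State") == state}
        # Keep only threads that were also in this state in all earlier samples.
        stuck = ids if stuck is None else stuck & ids
    return stuck or set()
```

A thread that shows up in the result for a 5 minute run of samples is the same connection stuck the whole time, not a series of short queries passing through that state.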
Can I have a copy of the slow query log? I think it's in the /usr/local/mysql/var directory, which I don't have access to.
-- Tim Starling.