2009/2/16 Aryeh Gregor <Simetrical+wikilist(a)gmail.com>:
On Mon, Feb 16, 2009 at 9:18 AM, Tim Starling
<tstarling(a)wikimedia.org> wrote:
I've deleted all the slow refreshLinks2 jobs which have apparently been
preventing the job queue from making any headway for the last few months.
Some people report that they have received hundreds of edit notification
emails in the last few hours, due to the months of backlog now being cleared.
So are there no alarm bells that go off when the job queue is
unreasonably long, or do people just not listen to them? Perhaps we
could have a bot in #wikimedia-tech that would complain every hour if
the oldest job in the queue is more than X days old?
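
A rough sketch of the check such a bot might run each hour, in Python. The function name, the message wording, and the 7-day threshold in the usage note are all made up for illustration; how the bot would actually fetch the oldest job's timestamp and post to IRC is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def backlog_alert(oldest_job_time: Optional[datetime],
                  now: datetime,
                  max_age: timedelta) -> Optional[str]:
    """Return a complaint if the oldest queued job is older than max_age.

    Hypothetical helper: returns None when the queue is empty or the
    oldest job is still within the allowed age.
    """
    if oldest_job_time is None:  # empty queue, nothing to complain about
        return None
    age = now - oldest_job_time
    if age <= max_age:
        return None
    return (f"Job queue backlog: oldest job is {age.days} days old "
            f"(limit: {max_age.days} days)")


if __name__ == "__main__":
    now = datetime(2009, 2, 16, tzinfo=timezone.utc)
    oldest = datetime(2008, 11, 16, tzinfo=timezone.utc)
    print(backlog_alert(oldest, now, timedelta(days=7)))
```

With X set to 7 days, a job queued on 2008-11-16 and checked on 2009-02-16 would trigger the complaint, while an empty queue or a fresh job would stay silent.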
Alternatively, the number of jobs processed per request could be made
a function of the length of the backlog (in terms of time) - the
longer the backlog is, the faster we process jobs. Then if the job
queue gets to be months behind, we would all notice it because
everything would start running really slowly. (Obviously, the length
of the job queue needs to be added to whatever diagnostic screen the
devs first check when the site slows down, otherwise it won't help
much.)
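
To make the idea concrete, here is a minimal sketch in Python of a per-request job rate that grows with the age of the backlog. The linear formula, the parameter names, and every default value are assumptions picked for illustration, not anything MediaWiki does today; the cap is there so a huge backlog slows requests down noticeably without making them unusable.

```python
def jobs_per_request(backlog_days: float,
                     base_rate: float = 1.0,
                     growth_per_day: float = 0.5,
                     max_rate: float = 50.0) -> float:
    """Return how many jobs to run per request for a given backlog age.

    Hypothetical policy: start at base_rate, grow linearly with the age
    of the oldest job (in days), and never exceed max_rate.
    """
    if backlog_days < 0:
        raise ValueError("backlog_days must be non-negative")
    rate = base_rate * (1.0 + growth_per_day * backlog_days)
    return min(max_rate, rate)


if __name__ == "__main__":
    for days in (0, 1, 10, 90):
        print(days, jobs_per_request(days))
```

With these made-up defaults, an empty backlog runs 1 job per request, a 10-day backlog runs 6, and anything older than about 98 days hits the cap of 50.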