"Brion Vibber" <brion(a)wikimedia.org> wrote in
message news:46B1773E.6060608@wikimedia.org...
Kinda? That's a 100% error! MySQL's count()
function is worse than I
thought... how much less efficient is the accurate method? Could we
cache the accurate value every 10 minutes or so and display that?
The issue here is that COUNT(*) is an horrendous operation on a large
InnoDB table, owing to the fact that the value isn't stored, and has
to be recalculated on demand. The estimate comes from running a quick
EXPLAIN SELECT and represents the number of rows the engine believes
might be involved, offhand.
Unfortunately that result is wildly inaccurate, but is reported as an
exact integer down to the ones place. :)
It might not hurt to do some basic approximations here: do the EXPLAIN
estimate, then if it's a smallish value, go ahead and grab an exact count.
If it's a very large value, then take the estimated count *and pass back
the fact that it's an estimate* and display accordingly.
I've long advocated removing this from the
Special:Statistics page;
it's not information which has relevance to the site, and users seem
to use it as a metric for all sorts of things, when it's not.
What would be more useful would be a historical graph showing the rise
and fall of the queue size. That would help people to avoid freaking out
when it's briefly large due to activity.
Can't we use the site_stats table. Have a new column 'ss_job_queue_length'
which is populated with the current correct value in the rebuildall script,
and then update this value whenever jobs are added or processed? Or are the
figures in this table approximations as well?
- Mark Clements (HappyDog)