"Brion Vibber" brion@wikimedia.org wrote in message news:46B1773E.6060608@wikimedia.org...
Rob Church wrote:
On 02/08/07, Thomas Dalton
thomas.dalton@gmail.com wrote:
Kinda? That's a 100% error! MySQL's count() function is worse than I thought... how much less efficient is the accurate method? Could we cache the accurate value every 10 minutes or so and display that?
The issue here is that COUNT(*) is an horrendous operation on a large InnoDB table, owing to the fact that the value isn't stored, and has to be recalculated on demand. The estimate comes from running a quick EXPLAIN SELECT and represents the number of rows the engine believes might be involved, offhand.
Unfortunately that result is wildly inaccurate, but is reported as an exact integer down to the ones place. :)
It might not hurt to do some basic approximations here: do the EXPLAIN estimate, then if it's a smallish value, go ahead and grab an exact count.
If it's a very large value, then take the estimated count *and pass back the fact that it's an estimate* and display accordingly.
I've long advocated removing this from the Special:Statistics page; it's not information which has relevance to the site, and users seem to use it as a metric for all sorts of things, when it's not.
What would be more useful would be a historical graph showing the rise and fall of the queue size. That would help people to avoid freaking out when it's briefly large due to activity.
Can't we use the site_stats table. Have a new column 'ss_job_queue_length' which is populated with the current correct value in the rebuildall script, and then update this value whenever jobs are added or processed? Or are the figures in this table approximations as well?
- Mark Clements (HappyDog)