Timwi wrote:
Tim Starling wrote:
Timwi wrote:
My guess is that the slowest part of it is checking whether a page
exists, and if it does, checking its size (if the user has set the
preference that shows stubs in a different colour), because both of
these require a database query.
What, even with the linkscc cache and the memcached link cache? If you
say so.
I apologise if my comment was in any way offensive to you, but please do
take note of the fact that (a) I said it was a guess; (b) I did mention
somewhere else that I have no real idea to what extent memcached is
already being used; (c) I have not attacked you, or even addressed you
at all.
You're misjudging my tone, I'm not offended. I'm just making fun of you :)
With that said, please may I humbly ask what "the linkscc cache"
actually caches? What exactly is stored in each memcache key here?
linkscc is a database table which stores a serialised, compressed
LinkCache object. This contains an array of "good" links, an array of
"bad" links, and an array of image links from a given page. Hence when a
page is loaded the script only needs one DB query to find out which
links exist. It turns out that only a small proportion of users have a
stub threshold. Viewing pages with a stub threshold does indeed require
lots of DB queries.
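
To make the idea concrete, here is a rough Python sketch of a serialised, compressed per-page link cache. It is not the actual MediaWiki code (which is PHP); the class name, the field names, the choice of JSON plus zlib, and the assumption that "good" links map titles to article IDs are all invented for illustration.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class LinkCacheSketch:
    """Hypothetical stand-in for the per-page link cache (not MediaWiki's class)."""
    good_links: dict = field(default_factory=dict)   # title -> article ID (assumed shape)
    bad_links: list = field(default_factory=list)    # titles of links to missing pages
    image_links: list = field(default_factory=list)  # titles of images used on the page


def pack(cache: LinkCacheSketch) -> bytes:
    """Serialise the three arrays to JSON and compress them into one blob."""
    payload = {
        "good": cache.good_links,
        "bad": cache.bad_links,
        "image": cache.image_links,
    }
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def unpack(blob: bytes) -> LinkCacheSketch:
    """Reverse of pack(): decompress and rebuild the cache object."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    return LinkCacheSketch(payload["good"], payload["bad"], payload["image"])


# One blob per page: a single fetch answers "which links exist?" for every link.
blob = pack(LinkCacheSketch({"Main Page": 1}, ["Missing page"], ["Example.png"]))
restored = unpack(blob)
link_exists = "Main Page" in restored.good_links
```

The point is only that one stored blob per page replaces one existence query per link; the stub-threshold case still needs page sizes, which this blob does not carry.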
In addition, the memcached "lc" keys store an article ID for each page
title. This improves efficiency where linkscc is not used, for example
when saving a page.
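
The title-to-ID lookup can be sketched along the same lines. This is a minimal illustration, assuming a plain dict in place of memcached and a made-up "lc:" key prefix; the real key layout and the lookup function are not shown in this thread, so every name below is hypothetical.

```python
from typing import Callable, Dict, Optional


class ArticleIdCache:
    """Hypothetical title -> article ID cache; a dict stands in for memcached."""

    def __init__(self, db_lookup: Callable[[str], Optional[int]]):
        self._store: Dict[str, int] = {}
        self._db_lookup = db_lookup
        self.db_queries = 0  # counts how often we fall through to the database

    @staticmethod
    def _key(title: str) -> str:
        # Assumed key format, for illustration only.
        return "lc:" + title

    def get_article_id(self, title: str) -> Optional[int]:
        key = self._key(title)
        if key in self._store:
            return self._store[key]  # cache hit: no DB query
        self.db_queries += 1
        article_id = self._db_lookup(title)
        if article_id is not None:
            self._store[key] = article_id  # only existing pages are cached here
        return article_id


# Usage with a fake "database" table.
fake_db = {"Main Page": 1, "Sandbox": 42}
cache = ArticleIdCache(fake_db.get)
first = cache.get_article_id("Sandbox")
second = cache.get_article_id("Sandbox")
```

After the two calls above, only the first one has touched the fake database.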
Isn't history compression going to be detrimental to CPU usage rather
than beneficial? I am still finding it hard to understand why so many
people here feel that history compression is necessary.
It's detrimental to CPU usage, but it reduces I/O on the database
machine, allows RAM caching of more articles, and reduces the bandwidth
required for backups and slave DB synchronisation. If compression and
decompression can be done in a few milliseconds of CPU time, it's likely
that it will be beneficial overall.
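
One way to check the "few milliseconds" condition is to time it directly. The sketch below assumes zlib as the compressor and uses repeated filler text as a stand-in for an article revision; neither is taken from the actual history compression code, and real timings depend on the hardware and the text.

```python
import time
import zlib


def measure_compression(text: str, level: int = 6) -> dict:
    """Time one compress/decompress round trip and report the sizes."""
    raw = text.encode("utf-8")

    start = time.perf_counter()
    packed = zlib.compress(raw, level)
    compress_ms = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    unpacked = zlib.decompress(packed)
    decompress_ms = (time.perf_counter() - start) * 1000

    return {
        "raw_bytes": len(raw),
        "compressed_bytes": len(packed),
        "compress_ms": compress_ms,
        "decompress_ms": decompress_ms,
        "round_trip_ok": unpacked == raw,
    }


# Filler text standing in for an old revision (an assumption, not real data).
sample = "The quick brown fox jumps over the lazy dog. " * 2000
result = measure_compression(sample)
print(result)
```

Comparing compress_ms and decompress_ms against the I/O saved by the smaller compressed_bytes is the trade-off in question.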
Although CPU is dominant for page views, for many other features it is
the database which is dominant. When viewing history, the database is
the major bottleneck.
-- Tim Starling