With the start of the school season, the request rate and thus load have gone way up in the last few weeks. A few more Apache boxen are being planned for, but we also have some optimization to do in storage access patterns.
On Domas's urging I've added a cache for extracted revision text using memcached. This can reduce both the load on the external storage servers and the time spent decompressing and extracting individual items from a bulk storage blob.
According to live profiling data on en.wikipedia and de.wikipedia, the time spent in Revision::getRevisionText has dropped from the 40s to the 10s in percentage terms, with a corresponding increase in memcached::get from about 8% to about 18%. That's a drop of about 200ms of real time per average profiled request during the US-daytime/Europe-evening rush.
Hopefully it keeps up and doesn't cause any other problems; I've set it to cache for an hour, but this could be reduced or increased as necessary.
The setting is $wgRevisionCacheExpiry and is available in 1.8 on SVN trunk, disabled by default. To be useful, it requires the main $wgMemc cache to be set up, and you should have a fair amount of external storage in use or else it might just be slower to make all those memcached queries. :)
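For anyone who wants to picture the pattern, here is a minimal sketch in Python rather than MediaWiki's PHP. It is not the actual Revision::getRevisionText code. The FakeMemcached class, the get_revision_text and load_from_storage names, and the "revisiontext:" key format are all made up for illustration, and treating an expiry of 0 as "disabled" is an assumption.

```python
import time


class FakeMemcached:
    """In-process stand-in for a memcached client (hypothetical, not a real client API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, absolute expiry time)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has outlived its expiry: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, expiry_seconds):
        self._items[key] = (value, self._clock() + expiry_seconds)


REVISION_CACHE_EXPIRY = 3600  # one hour, the value mentioned above


def get_revision_text(cache, rev_id, load_from_storage, expiry=REVISION_CACHE_EXPIRY):
    """Return revision text, consulting the cache before the slow storage path."""
    if expiry <= 0:
        # Assumption: a non-positive expiry means the revision cache is disabled.
        return load_from_storage(rev_id)
    key = f"revisiontext:{rev_id}"  # hypothetical key format
    text = cache.get(key)
    if text is None:
        # Miss: pay for the external storage fetch and extraction once,
        # then keep the extracted text around for later requests.
        text = load_from_storage(rev_id)
        cache.set(key, text, expiry)
    return text
```

The trade-off is visible here: every call costs one cache round trip, so the cache only wins when load_from_storage is noticeably more expensive than that lookup.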
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
I've set it to cache for an hour, but this could be reduced or increased as necessary.
Do you mean that you have set each revision text to expire from the cache after an hour? If so, out of curiosity, why? Doesn't memcached already expire items from the cache when the cache is full, and doesn't it already choose the least recently accessed item?
Timwi
On 9/18/06, Brion Vibber <brion@pobox.com> wrote: [snip]
Hopefully it keeps up and doesn't cause any other problems; I've set it to cache for an hour, but this could be reduced or increased as necessary.
[snip]
Has memcached's allocator changed? I'm somewhat surprised that the widely varying size of the page text isn't causing us fragmentation problems.
Gregory Maxwell wrote:
Has memcached's allocator changed? I'm somewhat surprised that the widely varying size of the page text isn't causing us fragmentation problems.
As I mentioned during the hacking days, the allocator granularity and semantics have been improved (along with a pile of other things) by Facebook in what is about to become memcached 1.2.0. If we see fragmentation errors, a switch to 1.2 is in order.
Ivan Krstić wrote:
Gregory Maxwell wrote:
Has memcached's allocator changed? I'm somewhat surprised that the widely varying size of the page text isn't causing us fragmentation problems.
As I mentioned during the hacking days, the allocator granularity and semantics have been improved (along with a pile of other things) by Facebook in what is about to become memcached 1.2.0. If we see fragmentation errors, a switch to 1.2 is in order.
I don't know what you mean by fragmentation. Memcached's problem is fixed distribution of memory between size classes, which can lead to one size class having a very different expiry time to another. I've never heard such a problem referred to as fragmentation. Allocation and deallocation are constant-time operations, so fragmentation of memory within a slab is irrelevant.
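To illustrate the size-class point, here is a toy model in Python. It is only a sketch and does not reproduce memcached's real allocator: the SlabClassCache name, the class limits, and the fixed slots_per_class are invented for the example. It shows why items in one size class can be evicted quickly while items in another class survive untouched.

```python
from collections import OrderedDict


class SlabClassCache:
    """Toy cache: memory is divided up front between size classes, each with its own LRU."""

    def __init__(self, class_limits, slots_per_class):
        # class_limits: maximum item size for each class, e.g. [100, 1000, 10000].
        self.class_limits = sorted(class_limits)
        self.slots_per_class = slots_per_class
        self.classes = {limit: OrderedDict() for limit in self.class_limits}
        self.key_class = {}  # key -> size class currently holding it

    def _class_for(self, size):
        # Linear in the (small, fixed) number of classes, not in the number of items.
        for limit in self.class_limits:
            if size <= limit:
                return limit
        raise ValueError("item too large for any size class")

    def set(self, key, value):
        self.delete(key)  # a changed value may belong in a different class
        limit = self._class_for(len(value))
        lru = self.classes[limit]
        if len(lru) >= self.slots_per_class:
            # Evict the least recently used item of THIS class only;
            # free slots in other classes cannot be borrowed.
            evicted_key, _ = lru.popitem(last=False)
            del self.key_class[evicted_key]
        lru[key] = value
        self.key_class[key] = limit

    def get(self, key):
        limit = self.key_class.get(key)
        if limit is None:
            return None
        lru = self.classes[limit]
        lru.move_to_end(key)  # mark as most recently used
        return lru[key]

    def delete(self, key):
        limit = self.key_class.pop(key, None)
        if limit is not None:
            del self.classes[limit][key]
```

With two slots per class, storing one small item and then a stream of large ones churns the large class while the small item stays cached indefinitely. That is the "very different expiry time" effect, and no slot inside a class is ever left unusable by item size.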
I imagine the revision cache would have a similar size distribution to the parser cache, which was the dominant item previously in terms of size.
-- Tim Starling
wikitech-l@lists.wikimedia.org