Memcached layer for revision text extraction - Wikitech-l

18 Sep 2006


With the start of the school season, the request rate and thus load have gone
way up in the last few weeks. A few more Apache boxen are planned, but we also
have some optimization to do in storage access patterns.
On Domas's urging I've added a cache for extracted revision text using
memcached. This can reduce both the load on the external storage servers and
the time spent decompressing and extracting individual items from a bulk
storage blob.
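For anyone who wants the general shape of this, here is a minimal read-through
cache sketch in Python. It is not the MediaWiki code: the class name, the key
format, the fetch_blob callback and the in-process dict standing in for
memcached are all made up for illustration, and treating an expiry of 0 as
"disabled" is an assumption of the sketch.

```python
import time
import zlib


class RevisionTextCache:
    """Illustrative read-through cache in front of a slow blob store."""

    def __init__(self, fetch_blob, expiry_seconds=3600, clock=time.monotonic):
        self._fetch_blob = fetch_blob   # hypothetical: rev_id -> zlib-compressed bytes
        self._expiry = expiry_seconds   # assumption: 0 means "do not cache"
        self._clock = clock
        self._store = {}                # stand-in for memcached: key -> (expires_at, text)
        self.hits = 0
        self.misses = 0

    def get_text(self, rev_id):
        key = f"revisiontext:{rev_id}"  # hypothetical key format
        now = self._clock()
        if self._expiry > 0:
            entry = self._store.get(key)
            if entry is not None and entry[0] > now:
                # Cache hit: skip both the storage fetch and the decompression.
                self.hits += 1
                return entry[1]
        # Cache miss (or cache disabled): do the expensive fetch and extraction.
        self.misses += 1
        text = zlib.decompress(self._fetch_blob(rev_id)).decode("utf-8")
        if self._expiry > 0:
            self._store[key] = (now + self._expiry, text)
        return text
```

The point is just that a repeat request for the same revision within the
expiry window never touches the storage server or the decompressor.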
According to live profiling data on en.wikipedia and de.wikipedia, the share of
profiled time spent in Revision::getRevisionText has dropped from the 40s to
the 10s (in percent), with a corresponding increase in memcached::get from
about 8% to about 18%. That's a drop of about 200 ms of real time per average
profiled request during the US-daytime/Europe-evening rush.
Hopefully it keeps up and doesn't cause any other problems; I've set it to cache
for an hour, but this could be reduced or increased as necessary.
The setting is $wgRevisionCacheExpiry and is available in 1.8 on SVN trunk,
disabled by default. To be useful, it requires the main $wgMemc cache to be set
up, and you should have a fair amount of external storage in use or else it
might just be slower to make all those memcached queries. :)
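A rough way to think about that trade-off, as a back-of-the-envelope Python
sketch. The model and the function names are mine, and it ignores the cost of
writing to the cache on a miss; plug in your own measured timings rather than
trusting any particular numbers.

```python
def expected_time_ms(hit_rate, cache_ms, fetch_ms):
    """Average per-lookup time with the cache enabled.

    Every lookup pays the cache round trip (cache_ms); only the misses
    also pay the full fetch-and-extract cost (fetch_ms).
    """
    return cache_ms + (1.0 - hit_rate) * fetch_ms


def cache_helps(hit_rate, cache_ms, fetch_ms):
    """True if the cached path is faster on average than always fetching."""
    return expected_time_ms(hit_rate, cache_ms, fetch_ms) < fetch_ms
```

With an expensive fetch (say external storage plus decompression) even a
modest hit rate wins; with a cheap local fetch the extra memcached round trip
can cost more than it saves, which is the case the caveat above is about.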
-- brion vibber (brion @ pobox.com)