Bah, now I can't clear my cache whenever I edit a .js file of mine; the old version just gets stuck in the cache. Others have had this problem, at least with .css.
Domas Mituzas wrote:
Hi,
Just wanted to share some of the bits we've been doing this week - we were hopping around and analyzing our performance and application workflow from multiple sides (a kind of "Hello 2008!!!" systems performance review).
It all started with the application object cache - the caching arena was bumped up from 55GB to 160GB - and more work had to be done here to make our parser output cacheable. Any use of magic words (and most templates do use them) would decrease cache TTLs to 1 hour, so the vast increase in caching space didn't help much. Once this was fixed, pages are reparsed just once every few days. Additionally, we moved the revision text caching for external storages to a global pool, instead of maintaining local caches on each of these nodes. That allows us to reuse the old external store boxes' memory for caching actively fetched revisions rather than archived ones.
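To illustrate the TTL idea, here is a rough sketch (hypothetical Python with made-up names, not our actual parser cache code): keep output cached for a long time unless the rendered page actually depends on volatile magic words.

    # Hypothetical sketch, not MediaWiki's real code: stable pages get a long
    # TTL, while pages whose output depends on time-sensitive magic words
    # (dates, times, ...) must expire much sooner.
    WEEK = 7 * 24 * 3600   # long TTL for stable parser output
    HOUR = 3600            # short TTL for time-dependent output

    def cache_parser_output(cache, key, output, uses_volatile_magic_words):
        ttl = HOUR if uses_volatile_magic_words else WEEK
        cache.set(key, output, ttl)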
Another major review was done on extension loading - there, by delaying or eliminating expensive initializations, especially for very rarely used extensions (relatively speaking :), we shaved at least 20ms off the site's base loading time (and the service request average). That also resulted in a huge CPU usage reduction. Special thanks here go to the folks on #mediawiki (Aaron, Nikerabbit, siebrand, Simetrical, and others) who joined this effort of analysis, education and engineering :) There are still more difficult extensions to handle, but I hope they will evolve to be more adaptive performance-wise. This was a long-standing regression caused by the increasing quality of translations, which resulted in a bigger data set to handle on every page load.
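The general pattern is roughly this (a sketch with hypothetical names, not the real extension API): register a cheap stub and only run the expensive setup when one of the extension's hooks actually fires, so pages that never use it pay nothing.

    # Illustrative sketch only: defer an extension's heavy setup (e.g. loading
    # big message files) until the first time it is really needed.
    class LazyExtension:
        def __init__(self, expensive_setup):
            self._setup = expensive_setup
            self._impl = None

        def _impl_or_load(self):
            if self._impl is None:      # pay the cost only on first use
                self._impl = self._setup()
            return self._impl

        def on_hook(self, *args):
            return self._impl_or_load().on_hook(*args)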
A small but noticeable bit was the simplification of the mediawiki:pagecategories message on en.wikipedia.org. Logic as simple as "show Category: if there is just one category, and Categories: otherwise" requires the parser, which adds lots and lots of overhead to every page served. The few milliseconds needed for that absolutely grammatically correct label could be counted in thousands of dollars. :)
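For contrast, the whole decision is trivial when expressed as plain code (an illustration only, not the actual message handling) - the expense comes from routing it through the wikitext parser on every page view, not from the logic itself:

    # Illustration: the choice itself is a one-liner; doing it via message
    # markup forces a full parse of the message for every page served.
    def category_label(categories):
        return "Category:" if len(categories) == 1 else "Categories:"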
There were a few other victims in this unequal fight. TitleBlacklist didn't survive the performance audit - the current architecture of this feature does work in places it never should, and as the initial performance guidelines for it were not followed, it got disabled for a while. Also, some CentralNotice functionality was not optimized for the work it was put to after the fundraiser, so for now that feature is disabled too. Of course, these features will be re-enabled - they just need more work before they can run live.
On another front - in the software core - the database connection flow was reviewed, and a few adjustments were made which reduce master server load quite a bit; we also do less communication with all the database servers (transaction coordination was too verbose before - now it is far more lax).
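As a rough sketch of the direction (hypothetical code, not our actual connection handling): open a master connection only when a write actually happens, and skip the commit round trip entirely for read-only requests.

    # Sketch: read-only page views talk only to a replica and never send a
    # needless BEGIN/COMMIT to the master.
    class ConnectionManager:
        def __init__(self, connect_master, connect_replica):
            self._connect_master = connect_master
            self._master = None                  # opened lazily
            self.replica = connect_replica()     # reads go here

        def master(self):
            if self._master is None:             # first write opens it
                self._master = self._connect_master()
            return self._master

        def finish_request(self):
            if self._master is not None:         # read-only requests skip COMMIT
                self._master.commit()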
Here again, some of the application flow is still irrational - and may get quite a bit of refactoring/fixing in the future. Tim pointed out that my knowledge of the xdebug profiler is seriously outdated (my mind was stuck at 2.0.1 features, whereas 2.0.2 introduced quite significant changes that make life easier) ;-) Another shocking revelation was that the CPU microbenchmarks provided by MediaWiki's internal profiler were not accurate at all - the getrusage() call we use provides information rounded to 10ms, and most functions execute far faster than that. It was really amusing that I trusted numbers which looked rational and reasonable only because of the huge profiling scale and eventual statistical magic. This complicates profiling a bit in general, as there's no easy way to determine whether a given wait happened because of I/O blocking or context switches.
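A quick way to see the granularity problem for yourself (Python on a Unix box; it is the same getrusage() call the profiler relies on):

    import resource

    def fast_function():
        return sum(range(1000))

    before = resource.getrusage(resource.RUSAGE_SELF).ru_utime
    fast_function()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_utime
    # Usually prints 0.0: the call finishes well inside one accounting tick
    # (often 10ms on 2008-era kernels), so only averaging over huge numbers
    # of samples produces anything meaningful.
    print("measured CPU time:", after - before)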
A few images from the performance analysis work: http://flake.defau.lt/mwpageview.png http://flake.defau.lt/mediawikiprofile.png (somewhere in there you should see why TitleBlacklist died)
This one made me giggle: http://flake.defau.lt/mwmodernart.png
Tim was questioning whether people are using wikitext for scientific calculations, or whether that was just more of the crazy over-templating we are used to seeing. Templates such as Commons' 'picture of the day' cause output like this =) Actually, the new parser code makes far nicer graphs (at least from a performance engineering perspective).
And one of the biggest changes happened on our Squid caching layer - because of how different browsers request data, we generally had different cache sets for IE, Firefox, Opera, Googlebot, KHTML, etc. Now we normalize the 'Accept-Encoding' header specified by browsers, which makes most connections fall into a single class. In theory this may at least double our caching efficiency. In practice, we will see - the change has been live on just one cluster for just a few hours. As a side effect we turned off the 'refresh' button on your browsers. Sorry - please let us know if anything is seriously wrong with that (if you feel offended about your constitutional refreshing rights - use purge instead :)
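Conceptually the normalization is just this (a sketch of the idea, not the actual Squid change):

    # Sketch: collapse the many browser-specific Accept-Encoding strings into
    # two canonical classes so cached objects are shared across browsers.
    def normalize_accept_encoding(header):
        if header and "gzip" in header.lower():
            return "gzip"   # all gzip-capable clients share one cached copy
        return ""           # everyone else shares the uncompressed copy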
Additionally, I've heard there has been quite a bit of development on the new parser, as well as networking in Amsterdam ;-)
Quite a few people also noticed the huge flamewar of 'oh noes, a dev enabled a feature despite our lack of consensus'. Now we're sending people to the board for all the minor changes they ask for :-)
Oh, and Mark changed the scale on our 'backend service time' graph, which is used to measure our health and performance - now the upper limit is 0.3s (which used to be our minimum a few years ago) instead of the old 1s: http://www.nedworks.org/~mark/reqstats/svctimestats-weekly.png
So, that's the fun we've seen this week in site operations :)
Cheers, Domas
P.S. I'll spend next week in Disneyworld instead ;-)