Today, Friday, the front page of the English Wikipedia has been fast all day.
Another page (I monitor http://www.wikipedia.com/wiki/Sweden) was slow for one period of 30 minutes (09:30-10:00 GMT) and another period of about two hours (11:40-13:50 GMT). Some other URLs on the international Wikipedias were also affected at the same time. This might be due to maintenance or work being done on the scripts.
Subtract 7 hours from GMT to get the server's local time (PDT = GMT -0700).
Apart from these two limited intervals, every URL that I monitor has been fast all day, including the recent changes pages.
I'm very happy with this, and hope Brion and Jimmy (and who else?) will soon get the talk namespace links back without hurting performance. (But hey, never make big fixes five minutes before you leave for the weekend! Better just leave it as is if you have to go.)
And now for some more relaxed Friday reading, actually related to performance problems. (The following analysis might be politically slanted. Don't take it too seriously.) The Swedish parliament elections are coming up in September, so the political parties are starting up their campaigns. The problem is there are no big issues to fight about. The four non-socialist parties have unusually boring candidates (Dukakis style), and everybody expects the current social-democratic government to win. The single issue that seems to be coming up is the national sick leave insurance, which is paid by tax money, and far over budget. This is linked to the fact that "burn-out" is now an accepted medical diagnosis for which you are allowed to take a long sick leave at the taxpayers' expense. You would expect such welfare excesses to be on the social democrat agenda, and that non-socialists would push for tax cuts and a balanced budget. However, the current s-d govt has been doing a great job balancing the budget, and they will now have to deal with cutting back this overgenerous sick leave compensation without hurting their voters' feelings. Tough job. The Christian-democratic party's candidate has already hurt a lot of feelings by claiming that "some" of those receiving compensation are "cheating the system". That might be true, but accusing "some" (who? me?) is obviously not the way to attract voters. This issue now has media attention and some interesting example cases are reported.
Like this one: Attorneys in Swedish district courts have been right-sized in the past years, as part of balancing the budget. This means that as soon as one gets sick, the rest get too much to do, leading to stress and burn-out, which leads to more sick leave.
Think of the court cases as HTTP requests arriving at Wikipedia. There are some processes/attorneys there to handle the cases, but for some reason one process gets blocked and cannot work. This leaves more work for the remaining workers, but they are probably waiting for the first process to finish and unlock the resources (database records?) that it is using. If processes are allowed to go to sleep waiting for each other, the work will pile up. It will never end.
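To make the pile-up concrete, here is a minimal Python sketch of a toy model. It is not how the Wikipedia scripts actually work; the function name, the tick-based clock and the numbers are all my own assumptions for illustration. Requests arrive at a steady rate, each worker finishes one request per tick, and during a "stall" one worker holds a lock while all the others wait for it, so nothing gets served.

```python
def simulate(ticks, arrivals_per_tick, workers, stall=None):
    """Return the backlog size after each tick (toy model, hypothetical numbers).

    stall is an optional (start, end) pair of ticks. During start <= t < end
    one worker is stuck holding a lock and every other worker waits for it,
    so the capacity of the whole system drops to zero.
    """
    backlog = 0
    history = []
    for t in range(ticks):
        backlog += arrivals_per_tick              # new requests join the line
        stalled = stall is not None and stall[0] <= t < stall[1]
        capacity = 0 if stalled else workers      # everyone waits on the lock
        backlog -= min(backlog, capacity)         # serve what we can this tick
        history.append(backlog)
    return history


if __name__ == "__main__":
    print(simulate(10, 3, 4))                 # healthy system, no stall
    print(simulate(10, 3, 4, stall=(2, 6)))   # four ticks with the lock held
```

In this toy model the healthy run never builds a backlog, while the stalled run grows by three requests per tick for as long as the lock is held. After the lock is released the backlog only shrinks by one request per tick, because the spare capacity (four workers against three arrivals) is so small. That slow recovery is the part that feels like "it will never end".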
So, what is the solution? Throwing more attorneys at the problem? Maybe, but more likely the work processes should be redesigned and simplified. That allows the available attorneys to finish up a case and take on the next one. Some of their tasks are more important than others, but the performance or throughput of the system depends on cutting away or redesigning the most time-consuming tasks. The high degree of sick-leave is an indicator of system design flaws (albeit an one), and thus not altogether bad.
In the same way, a high "load average" (as reported by the "uptime" or "top" commands) is one indicator that the Wikipedia system is flawed. The load average in a UNIX system is the average number of processes (over the last 1, 5 and 15 minutes) that are ready to run, waiting for the CPU to become available. Unfortunately, most of them are just waiting to see if their wanted resource has become available. If this is not the case (e.g. database record still locked), they will go back to the end of the line, waiting again. Do you remember those bread shop waiting lines in Soviet Russia?
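For anyone who wants to watch this number without running "uptime" by hand, here is a minimal Python sketch. os.getloadavg() and os.cpu_count() are standard library calls on UNIX-like systems; the helper names and the "load greater than number of CPUs" rule of thumb are my own assumptions, not an official threshold.

```python
import os


def is_overloaded(load, cpus):
    """Rule of thumb (an assumption): more runnable processes than CPUs
    means that some of them are standing in line."""
    return load > cpus


def load_report():
    """Collect the 1, 5 and 15 minute load averages and the CPU count."""
    one, five, fifteen = os.getloadavg()   # raises OSError where unsupported
    cpus = os.cpu_count() or 1             # cpu_count() may return None
    return {
        "1min": one,
        "5min": five,
        "15min": fifteen,
        "cpus": cpus,
        "overloaded": is_overloaded(one, cpus),
    }


if __name__ == "__main__":
    print(load_report())
```

A high number here only says that the line is long. It does not say whether the processes in line are doing useful work or just checking once more whether their lock has been released.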
Training new attorneys is in itself a time-consuming task, which should be avoided if possible. Instead of paying sick leave (for how long?) to the already trained attorneys, a "cure" for "burn-out" should be found that can bring them back to work, thus relieving the overload on their colleagues and saving taxpayers' money at the same time.
I have no idea how a "cure" for burn-out can be found, but I think it is a necessary political trick, and thus will happen. It will not hurt voters' feelings, and it is my guess that the people who can achieve this will work for the winners of the election.
This might be the weakest analogy in history, but I think we should treat the Wikipedia processes with the same dignity and respect that the Swedish voters would expect. After all, they're supposed to work for us. The processes feel self-fulfillment when they can finish their job on time, and get distressed when they get locked up. Any uncalled-for delay will only result in more work piling up. That is a flaw in the system design that has to be fixed, and we cannot go around claiming that "some" of the workers are trying to cheat the system. That will only lead to us losing their confidence.