I've been trying to profile the system to find the bottleneck.
Some simple disk monitoring yesterday showed very high levels of disk activity, both reads and writes. This is surprising, since the site should be able to run mostly out of memory, with disk activity needed only for the writes that update the database when pages are edited.
Rates were around 250 reads/second and 250 writes/second.
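For anyone who wants to reproduce this kind of measurement, here is a minimal sketch. It assumes a Linux host that exposes /proc/diskstats; that is an assumption, and it is not necessarily the tool that produced the figures above. The function names (parse_diskstats, io_rates, sample) are purely illustrative. It takes two snapshots of the per-device completed-I/O counters and divides the difference by the interval.

```python
import time


def parse_diskstats(text):
    """Map device name -> (reads_completed, writes_completed).

    Assumes the Linux /proc/diskstats layout: major, minor, device name,
    then the I/O counters, with reads completed in column 4 and writes
    completed in column 8 (1-based).
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        counters[fields[2]] = (int(fields[3]), int(fields[7]))
    return counters


def io_rates(before, after, seconds):
    """Return device -> (reads/second, writes/second) between two snapshots."""
    rates = {}
    for dev, (reads_after, writes_after) in after.items():
        if dev not in before:
            continue  # device appeared between snapshots
        reads_before, writes_before = before[dev]
        rates[dev] = (
            (reads_after - reads_before) / seconds,
            (writes_after - writes_before) / seconds,
        )
    return rates


def sample(interval=5.0, path="/proc/diskstats"):
    """Read the counters twice, `interval` seconds apart, and return the rates."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return io_rates(before, after, interval)


if __name__ == "__main__":
    for dev, (reads, writes) in sorted(sample().items()):
        print(f"{dev}: {reads:.1f} reads/s, {writes:.1f} writes/s")
```

These counters record completed requests per block device, so they show how busy each disk is but not which process or file is responsible.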
Disk head movement and platter rotation are very scarce resources, and contention for them will make any existing database locking problems far worse. I've got a couple of questions whose answers might help me:
* Can someone who knows the code tell me whether there is much use of intermediate files in the Wikipedia software?
* Can someone who knows the hardware tell me how many drives are in the RAID, and what RAID level is used?
Neil
wikitech-l@lists.wikimedia.org