Hopefully I have the right list now...
I was a developer a while back for the (purportedly) 17th busiest website in
the world, and though the sysadmins were more directly involved in improving
response speed, I ended up doing a lot of that work myself. The way that
Wikipedia is slow is very similar to the problems we had. The delays are almost
all in the initial request for new pages. Once the connection is made,
content usually comes across rapidly. This pattern points to some sort of
full queue in software, or a full queue due to excessive connections on a
single machine causing a hardware wait state. New URL requests are made to
stand in line, sometimes because the settings for maximum simultaneous
connections are too low, or the settings are high enough but all RAM is
consumed servicing current requests, etc. This may seem obvious, but it
lets us de-emphasize other potential problems such as a bloated, overworked
DB, bogged-down disk fetches, etc. So, based on all this, I would say the greatest
single improvement would be to set up some sort of simple DNS round robin
(true load balancing could come later). I'm not sure what your current
server setup is, but if you could have at least two Apache servers running
on two machines, with round-robin DNS rotating requests between them, I think
the majority of your response problems would disappear. Don't listen to
those who say Round Robin is a naive approach. It's true that allocation of
new connections is done in a "dumb" way (in a two-server setup it will just
throw every other connection to the secondary webserver), but that's all
you really need, I think. Suddenly each machine is servicing half the client
connections and everything is fast... Of course, maybe the reasons for your
slowness are more complex, but based on what I can see from the client side,
my suspicion is that a simple Round Robin would clear it all up, and that
simply adding new Apache processes on new servers as you grow would make you
at least 10 times faster during peak times than at present. -- JDG --
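
To make the "every other connection" idea concrete, here is a rough Python sketch of the arithmetic. It is only an illustration, not how DNS or Apache actually implement anything: the server names (web1, web2), the 1000 waiting requests, and the limit of 50 simultaneous connections are all made-up numbers, and it assumes every request takes the same amount of time to serve.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(num_requests, servers):
    """Hand each new request to the next server in a fixed rotation."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(num_requests)]


def ticks_to_drain(queued, max_connections):
    """Time steps needed to serve `queued` requests when at most
    `max_connections` can be in service per step (ceiling division)."""
    return -(-queued // max_connections)


# Hypothetical hostnames and numbers, chosen only for illustration.
servers = ["web1", "web2"]
assignments = assign_round_robin(1000, servers)
per_server = Counter(assignments)

# One machine facing all 1000 requests vs. each of two machines facing its share.
single = ticks_to_drain(1000, max_connections=50)
split = max(ticks_to_drain(n, max_connections=50) for n in per_server.values())

print(per_server)       # how many requests each server received
print(single, split)    # steps to clear the queue: one server vs. two
```

With these assumed numbers each server ends up with half the requests, so the line each new request stands in is half as long. In practice DNS rotation happens per lookup rather than per connection, and resolver caching makes the split less even than this.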