William Allen Simpson wrote:
From time to time, I'm sure we've all noticed that Wikipedia slows to a
crawl.
Last night (local time), for about 15-20 minutes, reading was poor and
writing was nearly impossible. See:
April 20, 2006, 10:36 pm
http://www.thewritingpot.com/wikistatus/
What is "local time"? Please state your times in UTC. The page you link to
doesn't go back as far as April 20, and it doesn't appear to have any
archive links.
In any case, there's not much point in complaining about slow response times
a day after the fact. As I told you before, the best place to contribute to
this sort of thing is on #wikimedia-tech.
http://mail.wikimedia.org/pipermail/wikitech-l/2006-April/034991.html
I tried looking at the site from various views. What struck me was that
no matter where I looked from here in the US, east or west or central, all
traffic seems to go to Florida, even when the servers are not responding.
No failover to other clusters?
There are no other clusters which fill the same role as pmtpa. Go to this page:
http://meta.wikimedia.org/wiki/Profiling/20051208
and tell me how fast the site would be if every one of those Database::query
or memcached::get calls required a couple of transatlantic RTTs. Using
centralised caches improves the hit rate, and keeping them within a few
kilometres of the apache servers makes the latency acceptable.
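
To put rough numbers on that, here is a minimal sketch of the arithmetic.
The call count and RTT values are hypothetical placeholders, not figures
taken from the profiling page, and the model assumes the calls are issued
one after another rather than in parallel.

```python
def added_latency_ms(backend_calls: int, rtt_ms: float,
                     round_trips_per_call: int = 2) -> float:
    """Time spent waiting on the network for one page view, assuming
    every backend call is sequential and costs a fixed number of RTTs."""
    return backend_calls * round_trips_per_call * rtt_ms


# Hypothetical page view that makes 50 database or memcached calls.
same_datacentre = added_latency_ms(50, rtt_ms=0.5)    # servers a few km apart
transatlantic = added_latency_ms(50, rtt_ms=80.0)     # servers an ocean apart

print(f"local: {same_datacentre} ms, transatlantic: {transatlantic} ms")
```

Under these assumed figures the network wait goes from 50 ms to 8 seconds
per page view, which is why the caches and databases stay next to the
apache servers.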
Also, the DNS stopped serving inverse addresses.
Compare:
[...]
That 84.40.24.22 inverse is only at 2 DNS servers, both located on the
same subnet (very bad practice):
Maybe you should complain to whoever owns those servers.
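
For anyone who wants to check a claim like that themselves, a minimal
sketch follows. The nameserver addresses are hypothetical documentation
addresses, not the servers discussed above, and the /24 prefix length is
an assumption, since the real subnet size depends on how the network is
provisioned.

```python
import ipaddress


def all_on_one_subnet(addresses, prefixlen=24):
    """Return True if every address falls inside the same subnet of the
    given prefix length (assumed /24 unless told otherwise)."""
    networks = {
        ipaddress.ip_network(f"{addr}/{prefixlen}", strict=False)
        for addr in addresses
    }
    return len(networks) == 1


# Hypothetical nameserver addresses, for illustration only.
print(all_on_one_subnet(["192.0.2.10", "192.0.2.200"]))    # same /24
print(all_on_one_subnet(["192.0.2.10", "198.51.100.10"]))  # different /24
```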
[...]
However, that loss of DNS responses from the same subnet leads to the
conclusion that the subnet might be under congestive collapse. That is,
this lag might not be produced by Wikimedia itself, but by a problem with
the link to or within the facility.
I very much doubt it. Did you try testing for packet loss by pinging a
Wikimedia server?
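
If you save the output of ping, the loss figure can be pulled out of its
summary line. This is only a sketch: it assumes the summary wording used
by common Linux and BSD ping implementations ("N% packet loss"), and the
sample text below is made up for illustration, not a real measurement.

```python
import re
from typing import Optional

# Matches the percentage in a summary line such as
# "10 packets transmitted, 9 received, 10% packet loss, time 9012ms".
_LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(ping_output: str) -> Optional[float]:
    """Return the packet loss percentage reported by ping, or None if
    no summary line in the assumed format is found."""
    match = _LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


# Made-up sample output, for illustration only.
sample = "10 packets transmitted, 9 received, 10% packet loss, time 9012ms"
print(packet_loss_percent(sample))
```

Sustained loss of more than a few percent during a slow period would
support the congestion theory. Loss near zero would point back at the
servers.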
Is there any other data that might correspond?
Does anybody have clues or notes on what actually might have been
happening at the time? RTG/MRTG?
Our MRTG stuff is still down following the loss of larousse, but you can
still use these:
http://ganglia.wikimedia.org/
http://tools.wikimedia.de/~leon/stats/reqstats/
https://wikitech.leuksman.com/view/Server_admin_log
-- Tim Starling