From time to time, I'm sure we've all noticed Wikipedia slowing to a crawl. Last night (local time), for about 15-20 minutes, reading was poor and writing was nearly impossible; see:
April 20, 2006, 10:36 pm http://www.thewritingpot.com/wikistatus/
I tried looking at the site from various vantage points. What struck me was that no matter where I looked from here in the US (east, west, or central), all traffic seems to go to Florida, even when the servers are not responding.
No failover to other clusters?
Also, the DNS stopped answering inverse (in-addr.arpa) lookups. Compare:
 9  ae-23-54.car3.tampa1.level3.net (4.68.104.107)  222.648 ms
    ae-13-55.car3.tampa1.level3.net (4.68.104.139)  221.783 ms
    ae-13-53.car3.tampa1.level3.net (4.68.104.75)  223.539 ms
10  level3-co1.tpax.as30217.net (4.71.0.10)  224.125 ms  222.308 ms  223.698 ms
11  e1-1.dr1.tpax.as30217.net (84.40.24.22)  230.567 ms  222.853 ms  227.562 ms
12  gi0-50.csw1-pmtpa.wikimedia.org (64.156.25.242)  222.394 ms  223.082 ms  223.56 ms
13  rr-206.pmtpa.wikimedia.org (207.142.131.206)  225.189 ms  215.542 ms  224.085 ms

11  ae-23-54.car3.Tampa1.Level3.net (4.68.104.107)  51.362 ms
    ae-13-51.car3.Tampa1.Level3.net (4.68.104.11)  51.299 ms
    ae-13-53.car3.Tampa1.Level3.net (4.68.104.75)  51.291 ms
12  level3-co1.tpax.as30217.net (4.71.0.10)  54.396 ms  53.682 ms  53.826 ms
13  84.40.24.22 (84.40.24.22)  54.127 ms  53.686 ms  53.826 ms
14  gi0-50.csw1-pmtpa.wikimedia.org (64.156.25.242)  59.873 ms  58.579 ms  55.517 ms
15  rr-235.pmtpa.wikimedia.org (207.142.131.235)  53.879 ms  54.104 ms  53.891 ms
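
The difference between the two traces can also be picked out mechanically. traceroute prints the bare address in place of the host name when the inverse lookup gets no answer, so a hop whose name equals its address had no PTR response. A rough Python sketch, assuming the usual `name (address)` layout; the function name `unresolved_hops` is made up for illustration:

```python
import re

# Matches "name (a.b.c.d)" pairs as printed by traceroute.
HOP = re.compile(r"(\S+) \((\d+\.\d+\.\d+\.\d+)\)")

def unresolved_hops(trace_lines):
    """Return addresses that traceroute printed without a host name."""
    missing = []
    for line in trace_lines:
        for name, address in HOP.findall(line):
            # No PTR answer: traceroute repeats the address as the "name".
            if name == address and address not in missing:
                missing.append(address)
    return missing

# Three hops copied from the second trace above.
second_trace = [
    "12 level3-co1.tpax.as30217.net (4.71.0.10) 54.396 ms 53.682 ms 53.826 ms",
    "13 84.40.24.22 (84.40.24.22) 54.127 ms 53.686 ms 53.826 ms",
    "14 gi0-50.csw1-pmtpa.wikimedia.org (64.156.25.242) 59.873 ms 58.579 ms 55.517 ms",
]
print(unresolved_hops(second_trace))
```

This only flags a symptom. A missing name could equally come from a resolver timeout on the tracing host, so it is worth repeating from more than one location.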
The inverse (PTR) record for 84.40.24.22 is served by only two DNS servers, both located on the same subnet (very bad practice):
;; ANSWER SECTION:
22.24.40.84.in-addr.arpa. 200 IN PTR e1-1.dr1.tpax.as30217.net.

;; AUTHORITY SECTION:
24.40.84.in-addr.arpa. 200 IN NS rns1.powermedium.com.
24.40.84.in-addr.arpa. 200 IN NS rns2.powermedium.com.

;; ADDITIONAL SECTION:
rns1.powermedium.com. 14128 IN A 84.40.24.94
rns2.powermedium.com. 14128 IN A 84.40.24.98
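
For anyone who wants to repeat the check on other addresses, here is a small Python sketch using only the standard `ipaddress` module. It builds the in-addr.arpa name for an address and tests whether a set of name servers fall in one network. The /24 prefix length is an assumption inferred from the `24.40.84.in-addr.arpa` delegation; the real routed subnet may be larger or smaller. The helper names `ptr_name` and `share_network` are made up for illustration.

```python
import ipaddress

def ptr_name(ip: str) -> str:
    """Return the fully qualified in-addr.arpa name for an IPv4 address."""
    return ipaddress.ip_address(ip).reverse_pointer + "."

def share_network(addresses, prefix_len=24):
    """True if every address falls in the same network of the given size.

    prefix_len=24 is an assumption, not something the DNS data states.
    """
    networks = {
        ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
        for a in addresses
    }
    return len(networks) == 1

print(ptr_name("84.40.24.22"))
# rns1 and rns2 addresses from the ADDITIONAL SECTION above
print(share_network(["84.40.24.94", "84.40.24.98"]))
```

If the second call reports that the servers share a network, a single link or router failure can take out every authoritative server for the zone at once, which is the concern raised here.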
However, the loss of DNS responses from that same subnet suggests the subnet itself might be suffering congestive collapse. That is, the lag might not be caused by Wikimedia itself, but by a problem with the link to, or within, the facility.
Is there any other data that might correspond?
Does anybody have clues or notes on what actually might have been happening at the time? RTG/MRTG?
- - -

Also, I'm seeing a DNS configuration that RFC 1034 discourages (a CNAME pointing to another CNAME):

;; ANSWER SECTION:
en.wikipedia.org. 92 IN CNAME rr.wikimedia.org.
rr.wikimedia.org. 600 IN CNAME rr.pmtpa.wikimedia.org.

;; AUTHORITY SECTION:
wikimedia.org. 7200 IN SOA ns0.wikimedia.org. hostmaster.wikimedia.org. 2006041914 43200 7200 1209600 3600
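
To check other names for the same pattern, the answer section of a dig response can be walked as plain text. The sketch below is a minimal Python example under the assumption that each record is one whitespace-separated line in the usual `name ttl class type target` order; the function name `cname_chain` is hypothetical.

```python
def cname_chain(answer_lines, name):
    """Follow CNAME records in dig-style answer lines, starting at name."""
    aliases = {}
    for line in answer_lines:
        fields = line.split()
        # Expected layout: owner, TTL, class, type, target
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "CNAME":
            aliases[fields[0].lower()] = fields[4].lower()
    chain = [name.lower()]
    # Stop at the first name with no alias, or if a loop is detected.
    while chain[-1] in aliases and aliases[chain[-1]] not in chain:
        chain.append(aliases[chain[-1]])
    return chain

answer = [
    "en.wikipedia.org. 92 IN CNAME rr.wikimedia.org.",
    "rr.wikimedia.org. 600 IN CNAME rr.pmtpa.wikimedia.org.",
]
chain = cname_chain(answer, "en.wikipedia.org.")
print(" -> ".join(chain))
print("CNAME hops:", len(chain) - 1)
```

More than one hop means an alias points at another alias, as in the records above.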