From time to time, I'm sure we've all noticed that Wikipedia slows to a crawl.
Last night (local time), for example, reading was poor and writing was nearly
impossible for about 15-20 minutes. See:
April 20, 2006, 10:36 pm
http://www.thewritingpot.com/wikistatus/
I tried looking at the site from various vantage points. What struck me was that
no matter where I looked from here in the US, east or west or central, all
traffic seems to go to Florida, even when the servers are not responding.
No failover to other clusters?
Also, the DNS stopped serving reverse (PTR) records. Compare:
 9  ae-23-54.car3.tampa1.level3.net (4.68.104.107)  222.648 ms
    ae-13-55.car3.tampa1.level3.net (4.68.104.139)  221.783 ms
    ae-13-53.car3.tampa1.level3.net (4.68.104.75)  223.539 ms
10  level3-co1.tpax.as30217.net (4.71.0.10)  224.125 ms  222.308 ms  223.698 ms
11  e1-1.dr1.tpax.as30217.net (84.40.24.22)  230.567 ms  222.853 ms  227.562 ms
12  gi0-50.csw1-pmtpa.wikimedia.org (64.156.25.242)  222.394 ms  223.082 ms  223.56 ms
13  rr-206.pmtpa.wikimedia.org (207.142.131.206)  225.189 ms  215.542 ms  224.085 ms

11  ae-23-54.car3.Tampa1.Level3.net (4.68.104.107)  51.362 ms
    ae-13-51.car3.Tampa1.Level3.net (4.68.104.11)  51.299 ms
    ae-13-53.car3.Tampa1.Level3.net (4.68.104.75)  51.291 ms
12  level3-co1.tpax.as30217.net (4.71.0.10)  54.396 ms  53.682 ms  53.826 ms
13  84.40.24.22 (84.40.24.22)  54.127 ms  53.686 ms  53.826 ms
14  gi0-50.csw1-pmtpa.wikimedia.org (64.156.25.242)  59.873 ms  58.579 ms  55.517 ms
15  rr-235.pmtpa.wikimedia.org (207.142.131.235)  53.879 ms  54.104 ms  53.891 ms
The reverse (PTR) record for 84.40.24.22 is served by only two DNS servers,
both located on the same subnet (very bad practice):
;; ANSWER SECTION:
22.24.40.84.in-addr.arpa. 200 IN PTR e1-1.dr1.tpax.as30217.net.
;; AUTHORITY SECTION:
24.40.84.in-addr.arpa. 200 IN NS rns1.powermedium.com.
24.40.84.in-addr.arpa. 200 IN NS rns2.powermedium.com.
;; ADDITIONAL SECTION:
rns1.powermedium.com. 14128 IN A 84.40.24.94
rns2.powermedium.com. 14128 IN A 84.40.24.98
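
As a rough sketch of the check behind that remark, the snippet below uses
Python's standard ipaddress module to build the in-addr.arpa name for the
router address and to test whether both name servers fall inside one network.
The /24 prefix length is an assumption inferred from the delegated zone
24.40.84.in-addr.arpa (the real allocation may be sized differently), and
share_network is a hypothetical helper name, not part of any existing tool.

```python
import ipaddress

def share_network(addresses, prefix_len=24):
    """Return True if every address falls in the same network of the given prefix length."""
    networks = {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }
    return len(networks) == 1

# Reverse-lookup name that gets queried for the router hop
reverse_name = ipaddress.ip_address("84.40.24.22").reverse_pointer

# A records copied from the ADDITIONAL SECTION above
nameservers = ["84.40.24.94", "84.40.24.98"]
same_subnet = share_network(nameservers)  # prefix length of 24 is an assumption
```

If same_subnet comes back true, a single outage or congested link on that
network can take out every authoritative server for the zone at once.
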
However, the loss of DNS responses from that same subnet suggests that the
subnet might be under congestive collapse. That is, this lag might not be
produced by Wikimedia itself, but by a problem with the link to or within
the facility.
Is there any other data that might correspond?
Does anybody have clues or notes on what actually might have been
happening at the time? RTG/MRTG?
- - -
Also, I'm seeing an incorrect DNS configuration (a CNAME pointing to a CNAME):
;; ANSWER SECTION:
en.wikipedia.org. 92 IN CNAME rr.wikimedia.org.
rr.wikimedia.org. 600 IN CNAME rr.pmtpa.wikimedia.org.
;; AUTHORITY SECTION:
wikimedia.org. 7200 IN SOA ns0.wikimedia.org. hostmaster.wikimedia.org. 2006041914 43200 7200 1209600 3600
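
To make the chain explicit, here is a small sketch that walks the CNAME
records through a hand-copied table of the answers above and counts the hops.
The table and the follow_cnames name are purely illustrative; the code does
not query DNS, it only traverses the records pasted in this message.

```python
def follow_cnames(name, cname_records, max_hops=8):
    """Follow CNAME records starting at name; return the list of names visited."""
    chain = [name]
    # Stop at the first name with no CNAME, or after max_hops to guard against loops
    while chain[-1] in cname_records and len(chain) <= max_hops:
        chain.append(cname_records[chain[-1]])
    return chain

# CNAME records copied from the ANSWER SECTION above
records = {
    "en.wikipedia.org.": "rr.wikimedia.org.",
    "rr.wikimedia.org.": "rr.pmtpa.wikimedia.org.",
}

chain = follow_cnames("en.wikipedia.org.", records)
cname_hops = len(chain) - 1  # more than one hop means a CNAME points at a CNAME
```

Each extra hop is another record a resolver has to follow before it gets an
address, which is why chains like this are usually discouraged.
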