[Labs-l] Random issues that require an OPs attention to fix
Damian Zaremba
damian at damianzaremba.co.uk
Sat Oct 6 16:43:25 UTC 2012
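1) DNS is broken/half working: labs-ns0 returns an empty, non-authoritative answer for the wmflabs.org NS query, while labs-ns1 answers properly.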
phoenix:~ damian$ dig wmflabs.org NS @labs-ns0.wikimedia.org
; <<>> DiG 9.6-ESV-R4-P3 <<>> wmflabs.org NS @labs-ns0.wikimedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17397
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;wmflabs.org. IN NS
;; Query time: 150 msec
;; SERVER: 208.80.152.33#53(208.80.152.33)
;; WHEN: Sat Oct 6 17:33:03 2012
;; MSG SIZE rcvd: 29
phoenix:~ damian$ dig wmflabs.org NS @labs-ns1.wikimedia.org
; <<>> DiG 9.6-ESV-R4-P3 <<>> wmflabs.org NS @labs-ns1.wikimedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46082
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;wmflabs.org. IN NS
;; ANSWER SECTION:
wmflabs.org. 3600 IN NS labs-ns1.wikimedia.org.
wmflabs.org. 3600 IN NS labs-ns0.wikimedia.org.
;; Query time: 175 msec
;; SERVER: 208.80.154.19#53(208.80.154.19)
;; WHEN: Sat Oct 6 17:33:09 2012
;; MSG SIZE rcvd: 85
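The difference between the two replies is in the header: labs-ns0 answers without the aa flag and with ANSWER: 0, while labs-ns1 sets aa and returns two records. A rough sketch of how that check could be scripted, assuming saved dig output as input; the function names parse_dig_header and looks_lame are made up for illustration and are not part of any existing tool:

```python
import re


def parse_dig_header(output: str) -> dict:
    """Extract the flag set and ANSWER count from dig's header lines."""
    match = re.search(r";; flags: ([a-z ]*); QUERY: \d+, ANSWER: (\d+)", output)
    if match is None:
        raise ValueError("no dig flags line found")
    return {"flags": set(match.group(1).split()), "answers": int(match.group(2))}


def looks_lame(output: str) -> bool:
    """True if the reply is non-authoritative or carries no answer records.

    Assumption: the queried server is supposed to be authoritative for the
    zone, so a missing 'aa' flag or an empty answer counts as a fault.
    """
    header = parse_dig_header(output)
    return "aa" not in header["flags"] or header["answers"] == 0


# Header lines copied from the two dig runs above.
NS0 = ";; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0"
NS1 = ";; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0"

print("labs-ns0 lame:", looks_lame(NS0))
print("labs-ns1 lame:", looks_lame(NS1))
```

This only reads text that dig already printed; it does not send any queries itself.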
Also, the SOA is wrong, as it still points to virt0:
phoenix:~ damian$ dig wmflabs.org SOA @labs-ns1.wikimedia.org
; <<>> DiG 9.6-ESV-R4-P3 <<>> wmflabs.org SOA @labs-ns1.wikimedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46569
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;wmflabs.org. IN SOA
;; ANSWER SECTION:
wmflabs.org. 3600 IN SOA virt0.wikimedia.org. hostmaster.wikimedia.org. 1349449000 1800 3600 86400 7200
;; Query time: 128 msec
;; SERVER: 208.80.154.19#53(208.80.154.19)
;; WHEN: Sat Oct 6 17:33:39 2012
;; MSG SIZE rcvd: 92
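To make the SOA complaint concrete: the first name in the SOA data (the MNAME, i.e. the primary server) is virt0.wikimedia.org., which is not one of the two NS records returned earlier. A minimal sketch of that comparison, assuming the record text is pasted in from dig; parse_soa and the SOA tuple are hypothetical helpers, and treating an MNAME outside the NS set as suspicious is my assumption (hidden-primary setups do this on purpose):

```python
from typing import NamedTuple


class SOA(NamedTuple):
    mname: str    # primary name server
    rname: str    # responsible mailbox, in DNS name form
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int


def parse_soa(record: str) -> SOA:
    """Parse one SOA answer line from dig; tolerates a wrapped line."""
    fields = record.split()
    rdata = fields[fields.index("SOA") + 1:]
    if len(rdata) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(rdata)}")
    return SOA(rdata[0], rdata[1], *(int(value) for value in rdata[2:]))


# Record and NS names copied from the dig output above.
RECORD = (
    "wmflabs.org. 3600 IN SOA virt0.wikimedia.org. "
    "hostmaster.wikimedia.org. 1349449000 1800 3600 86400 7200"
)
NS_NAMES = {"labs-ns0.wikimedia.org.", "labs-ns1.wikimedia.org."}

soa = parse_soa(RECORD)
print("MNAME:", soa.mname)
print("MNAME listed as NS:", soa.mname in NS_NAMES)
```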
2) Rebooting an instance tends to leave it down for good. Could
someone please fix bots-cb? (Same as sql2: the first reboot took it
down, the second results in 'failed'.)
3) Logins randomly fail because key auth times out (seems to be related
to NFS crapping out)
4) Home dirs sometimes randomly drop their mounts (also seems to be
related to NFS crapping out; dmesg just shows RPC timeouts)
(Yes, I know it's a Saturday, but as the guy in Code Rush said: Writing
software is different from selling real estate. Selling real estate, the
people you sell to sleep at night. When they go to sleep you
have to stop selling real estate. Computers never sleep.)
Damian