[Labs-admin] Ops meeting and Clinic duty update 6/12

Madhumitha Viswanathan mviswanathan at wikimedia.org
Mon Jun 12 18:10:55 UTC 2017


Our meeting etherpad - https://etherpad.wikimedia.org/p/WMCS-2017-06-13
Ops meeting etherpad - https://etherpad.wikimedia.org/p/TechOps-2017-06-12
Ops goals etherpad - https://etherpad.wikimedia.org/p/TechOps-goals-FQ1-FY1718

Today's ops meeting centered on Q1 2017 goals. The etherpad has all the
details; these are the items that touched our team (Bryan did most of the
talking):

1. Salt deprecation goal - Ricardo is going to try to get cumin working for
labs as well this quarter. Faidon requested some support from us in helping
him figure out how to use the OpenStack API and such, to make it a bit
easier on them. Bryan agreed.

2. Turning on future parser for puppet 3.8 - It seems like their goal here
will just be to raise awareness this quarter about breaking parser changes,
and do the actual switch only next quarter or so. There was talk of having
the puppet compiler warn where existing usage is deprecated under the
future parser. For us, they said, this means that in the near future they
are going to be making puppet changes that could break things for us.
Dzahn raised that we don't have the watroles tool now, which makes it hard
for him to see what he may be breaking when he changes a role. Bryan seemed
to agree, and we should talk about maybe getting that tool built sooner
rather than later.

3. Phasing out trusty - I don't think this is going to be a goal, but it
seems reporting on remaining trusty instances is going to be a regular ops
meeting item from now on. Mark/Ricardo mentioned the labsdb boxes, and
Bryan said we are on board with deprecating them and working on the new
servers when the DBAs give us the green light that all the data has been
imported to the new Public Wiki Replicas (which is what I'm going to call
labsdb) databases.

Clinic duty update:

- Noticed that no one really uses !help or pings the on-call person, so a
lot of the time you're the frontline person only if you are constantly
reading #wikimedia-cloud
- Some complaints this week about the redirect from the labs channel
causing a kick without explanation, and about the new name. Mostly pointed
people to the list announcements, the blog, and the renaming discussion
- Saw from the Tools grafana board that 3 lighttpd instances were reporting
disk space critical on /tmp. Followed up and learnt that wsexport was
generating loads of temp files. Commented at
https://phabricator.wikimedia.org/T166337 and cleaned the instances up
- Similar disk space alert on tools-bastion-03
(https://phabricator.wikimedia.org/T167542), due to the tool bothasava
writing too many cache files in the bastion's /tmp. They've moved it to NFS
now, in /data/project. Not sure that's the best of ideas; will follow up on
the ticket.

-- 
Madhu :)