Some of those will be hosts that are deliberately shut down (I have at
least 4 hosts in deployment-prep like this pending deletion). Actually, we
should probably make shinkengen check what nova says the status is.
Will look into it.
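A rough sketch of the status check suggested above. This is not shinkengen's actual code: the function name and the list-of-dicts input are assumptions for illustration, and only the "ACTIVE" and "SHUTOFF" strings are real Nova server states. The idea is simply to skip anything Nova does not report as running before generating monitoring config.

```python
# Hypothetical sketch, not shinkengen's real code.
# Each server is assumed to be a dict with "name" and "status" keys,
# where status is a Nova server state such as "ACTIVE" or "SHUTOFF".

def hosts_to_monitor(servers):
    """Return the sorted names of instances that Nova reports as ACTIVE."""
    return sorted(s["name"] for s in servers if s.get("status") == "ACTIVE")


if __name__ == "__main__":
    servers = [
        {"name": "deployment-deploy-01", "status": "ACTIVE"},
        {"name": "example-shut-down-host", "status": "SHUTOFF"},  # made-up name
    ]
    print(hosts_to_monitor(servers))
```

Deliberately shut-down hosts pending deletion would then never reach Shinken in the first place.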
On Fri, 15 Jun 2018, 20:00 Andrew Bogott, <abogott(a)wikimedia.org> wrote:
I'm very glad you're keeping an eye on those! Shinken reports many more
breakages; I guess that's mostly an issue with purging down or
no-longer-existing VMs.
On 6/15/18 1:33 PM, Alex Monk wrote:
Thought I'd mention something I worked on recently - I have a cron job on
deployment-cumin that runs puppet across everything (that openstack says is
running) and emails me a list of hosts with any problems. It has a
little config that lets me associate a host with a task. deployment-prep is
looking better than I thought.
(deploy-01 got broken in the security updates and I'm planning to look
into cache-text04 later - think this was a timeout of some sort, likely
related to my certificate work there)
---------- Forwarded message ----------
From: <krenair(a)beta.wmflabs.org>
Date: 15 June 2018 at 19:05
Subject: Deployment-prep Puppet error hosts report
To: krenair(a)gmail.com
Hostname                                               Task?
deployment-cache-text04.deployment-prep.eqiad.wmflabs  None
deployment-deploy-01.deployment-prep.eqiad.wmflabs     T192561 <https://phabricator.wikimedia.org/T192561>
Hosts configured with tasks but are not listing as broken anymore:
Hostname                                               Task
deployment-mx.deployment-prep.eqiad.wmflabs            T184244
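For anyone wanting something similar, the two tables in the report above can be derived from just a list of failing hosts and the host-to-task config. The following is a minimal sketch under that assumption, not the actual script on deployment-cumin; `build_report`, `format_report` and the dict-based config are hypothetical names and shapes, and how the failing hosts are collected (the puppet run itself) is left out.

```python
# Hypothetical sketch of the report logic, not the real cron script.

def build_report(broken_hosts, host_tasks):
    """Split hosts into (still_broken, recovered).

    still_broken: (hostname, task or None) for every failing host.
    recovered: (hostname, task) for hosts that have a task configured
    but are no longer failing.
    """
    broken = set(broken_hosts)
    still_broken = [(h, host_tasks.get(h)) for h in sorted(broken)]
    recovered = [(h, t) for h, t in sorted(host_tasks.items()) if h not in broken]
    return still_broken, recovered


def format_report(still_broken, recovered):
    """Render the two lists as plain text for the email body."""
    lines = ["Hostname Task?"]
    lines += [f"{h} {t}" for h, t in still_broken]
    if recovered:
        lines.append("Hosts configured with tasks but are not listing as broken anymore:")
        lines.append("Hostname Task")
        lines += [f"{h} {t}" for h, t in recovered]
    return "\n".join(lines)


if __name__ == "__main__":
    suffix = ".deployment-prep.eqiad.wmflabs"
    broken = ["deployment-deploy-01" + suffix, "deployment-cache-text04" + suffix]
    tasks = {
        "deployment-deploy-01" + suffix: "T192561",
        "deployment-mx" + suffix: "T184244",
    }
    print(format_report(*build_report(broken, tasks)))
```

Listing the recovered hosts separately is what makes stale config entries visible, so the task association can be removed once a host is fixed.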
_______________________________________________
Cloud-admin mailing list
Cloud-admin(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/cloud-admin