Some of those will be hosts that are deliberately shut down (I have at least 4 hosts in deployment-prep like this, pending deletion). We should probably make shinkengen check what nova reports as the status.
Will look into it.

On Fri, 15 Jun 2018, 20:00 Andrew Bogott, <abogott@wikimedia.org> wrote:
I'm very glad you're keeping an eye on those!  Shinken reports many more breakages; I guess that's mostly an issue with purging VMs that are down or no longer exist.

On 6/15/18 1:33 PM, Alex Monk wrote:
Thought I'd mention something I worked on recently: I have a cron on deployment-cumin that runs puppet across everything (that openstack says is running) and emails me a list of hosts with any problems. It has a little config that lets me associate a host with a task. deployment-prep is looking better than I thought.
(deploy-01 got broken in the security updates, and I'm planning to look into cache-text04 later. I think this was a timeout of some sort, likely related to my certificate work there.)
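
For illustration only, the report step could be sketched roughly like this. It is not the actual script: the function name build_report, its arguments, and the assumption that the list of broken hosts has already been collected from the puppet run are all hypothetical. It only shows how a host-to-task config might be combined with the list of failing hosts to produce a report shaped like the one forwarded below.

```python
def build_report(broken_hosts, host_tasks):
    """Return report text for hosts with Puppet problems.

    broken_hosts: iterable of hostnames whose Puppet run had problems
                  (assumed to be collected elsewhere, e.g. by the cron job).
    host_tasks:   dict mapping hostname -> task ID (hypothetical config format).
    """
    broken = sorted(set(broken_hosts))
    lines = ["Hostname Task?"]
    for host in broken:
        # Hosts with no configured task are shown with "None".
        lines.append(f"{host} {host_tasks.get(host)}")

    # Hosts that still have a task configured but no longer show up as broken.
    recovered = sorted(h for h in host_tasks if h not in broken)
    if recovered:
        lines.append("")
        lines.append("Hosts configured with tasks but are not listing as broken anymore:")
        lines.append("Hostname Task")
        for host in recovered:
            lines.append(f"{host} {host_tasks[host]}")
    return "\n".join(lines)


if __name__ == "__main__":
    tasks = {
        "deployment-deploy-01.deployment-prep.eqiad.wmflabs": "T192561",
        "deployment-mx.deployment-prep.eqiad.wmflabs": "T184244",
    }
    broken = [
        "deployment-deploy-01.deployment-prep.eqiad.wmflabs",
        "deployment-cache-text04.deployment-prep.eqiad.wmflabs",
    ]
    print(build_report(broken, tasks))
```

Listing the "no longer broken" hosts separately makes it easy to spot config entries that can be dropped once the associated task is resolved.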

---------- Forwarded message ----------
From: <krenair@beta.wmflabs.org>
Date: 15 June 2018 at 19:05
Subject: Deployment-prep Puppet error hosts report
To: krenair@gmail.com


Hostname Task?
deployment-cache-text04.deployment-prep.eqiad.wmflabs None
deployment-deploy-01.deployment-prep.eqiad.wmflabs T192561

Hosts configured with tasks but are not listing as broken anymore:
Hostname Task
deployment-mx.deployment-prep.eqiad.wmflabs T184244



_______________________________________________
Cloud-admin mailing list
Cloud-admin@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/cloud-admin

