Thank you very much, Andrew and company.
Follow-up on this:
The WMCS team is pretty sure that all user-facing services have been
restored. If you encounter any current unexpected breakage, please email
me directly or use !help on IRC.
There's still a fair bit of less-urgent cleanup left to do. Puppet will
remain disabled on most VMs until that's finished, which may take a day
or two.
-Andrew + the WMCS team.
On 6/4/20 10:18 AM, Bryan Davis wrote:
> At 2020-06-04T11:12 UTC a change was merged to the
> operations/puppet.git repository which resulted in data loss for Cloud
> VPS projects using a local Puppetmaster
> (role::puppetmaster::standalone). Specifically, any commits local to
> the Puppetmaster instance that were overlaid on the upstream
> labs/private.git repository were removed. These patches would have
> contained passwords, SSH keys, TLS certificates, and similar
> authentication information for Puppet-managed configuration.
>
> The majority of Cloud VPS projects are not affected by this
> configuration data loss. Several highly used and visible projects,
> including Toolforge (tools) and Beta Cluster (deployment-prep), have
> some impact. We have disabled Puppet across all Cloud VPS instances
> that were reachable by our central command and control service (cumin)
> and are currently evaluating impact and recovering data from
> /var/logs/puppet.log change logs where available.
>
> More information will be collected at
> <https://phabricator.wikimedia.org/T254491> and an incident report
> will also be prepared once the initial response is complete.
>
> Bryan
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud