[QA] Beta Cluster and CI issues (was Fwd: [Cloud] [Cloud-announce] VPS hardware failure -- downtime ongoing)

Greg Grossmeier greg at wikimedia.org
Wed Feb 13 16:11:17 UTC 2019


This WMCS outage is affecting Beta Cluster (aka: deployment-prep VPS
project) and our CI build/test servers (aka: the integration project).

Apologies for any downtime and/or slow CI responses while this is being
sorted.

Greg

----- Forwarded message from Andrew Bogott <abogott at wikimedia.org> -----

> Date: Wed, 13 Feb 2019 07:25:39 -0600
> From: Andrew Bogott <abogott at wikimedia.org>
> To: Cloud-announce at lists.wikimedia.org
> Subject: [Cloud] [Cloud-announce] VPS hardware failure -- downtime ongoing
> Reply-To: cloud at lists.wikimedia.org
> 
> We're currently experiencing a mysterious hardware failure in our datacenter
> -- three different SSDs failed overnight, two of them in cloudvirt1018 and
> one of them in cloudvirt1024.  The VMs on 1018 are down entirely.  We may
> move those on 1024 to another host shortly in order to guard against
> additional drive failure.
> 
> There's some possibility that we will experience permanent data loss on
> cloudvirt1018, but everyone is working hard to avoid this.
> 
> The following VMs are on cloudvirt1018:
> 
> 
> a11y                             | reading-web-staging
> abogott-scapserver               | testlabs
> af-puppetdb01                    | automation-framework
> api                              | openocr
> asdf                             | quotatest
> bastion-eqiad1-02                | bastion
> clm-test-01                      | community-labs-monitoring
> compiler1002                     | puppet-diffs
> cyberbot-exec-iabot-01           | cyberbot
> deployment-db03                  | deployment-prep
> deployment-db04                  | deployment-prep
> deployment-memc05                | deployment-prep
> deployment-pdfrender02           | deployment-prep
> deployment-sca01                 | deployment-prep
> design-lsg3                      | design
> eventmetrics-dev01               | eventmetrics
> fridolin                         | catgraph
> gtirloni-puppetmaster-01         | testlabs
> hadoop-master-3                  | analytics
> ign                              | ign2commons
> integration-castor03             | integration
> integration-slave-docker-1017    | integration
> integration-slave-docker-1033    | integration
> integration-slave-docker-1038    | integration
> integration-slave-jessie-1003    | integration
> integration-slave-jessie-android | integration
> k8s-master-01                    | general-k8s
> k8s-node-03                      | general-k8s
> k8s-node-05                      | general-k8s
> k8s-node-06                      | general-k8s
> kdc                              | analytics
> labstash-jessie1                 | logging
> language-mleb-legacy             | language
> login-test                       | catgraph
> lsg-01                           | design
> mathosphere                      | math
> mc-clusterA-1                    | test-twemproxy
> mwoffliner5                      | mwoffliner
> novaadminmadethis-4              | quotatest
> ntp-01                           | cloudinfra
> ntp-02                           | cloudinfra
> ogvjs-testing                    | ogvjs-integration
> phragile-pro                     | phragile
> planet-hotdog                    | planet
> pub2                             | wikiapiary
> puppenmeister                    | planet
> puppet-compiler-v4-other         | testlabs
> puppet-compiler-v4-tools         | testlabs
> quarry-beta-01                   | quarry
> signwriting-swis                 | signwriting
> signwriting-swserver             | signwriting
> social-tools3                    | social-tools
> striker-deploy04                 | striker
> striker-puppet01                 | striker
> t166878                          | otrs
> togetherjs                       | visualeditor
> tools-sgebastion-06              | tools
> tools-sgeexec-0902               | tools
> tools-sgeexec-0903               | tools
> tools-sgewebgrid-generic-0901    | tools
> tools-sgewebgrid-lighttpd-0901   | tools
> ve-font                          | design
> wikibase1                        | sciencesource
> wikicitevis-prod                 | wikicitevis
> wikifarm                         | pluggableauth
> women-in-red                     | globaleducation
> 
> _______________________________________________
> Wikimedia Cloud Services announce mailing list
> Cloud-announce at lists.wikimedia.org (formerly labs-announce at lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/cloud-announce
> _______________________________________________
> Wikimedia Cloud Services mailing list
> Cloud at lists.wikimedia.org (formerly labs-l at lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/cloud

----- End forwarded message -----

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager            A18D 1138 8E47 FAC8 1C7D |
