[QA] Beta Cluster and CI issues (was Fwd: [Cloud] [Cloud-announce] VPS hardware failure -- downtime ongoing)
Greg Grossmeier
greg at wikimedia.org
Wed Feb 13 16:11:17 UTC 2019
This WMCS outage is affecting Beta Cluster (aka: deployment-prep VPS
project) and our CI build/test servers (aka: the integration project).
Apologies for any downtime and/or slow CI responses while this is being
sorted.
Greg
----- Forwarded message from Andrew Bogott <abogott at wikimedia.org> -----
> Date: Wed, 13 Feb 2019 07:25:39 -0600
> From: Andrew Bogott <abogott at wikimedia.org>
> To: Cloud-announce at lists.wikimedia.org
> Subject: [Cloud] [Cloud-announce] VPS hardware failure -- downtime ongoing
> Reply-To: cloud at lists.wikimedia.org
>
> We're currently experiencing a mysterious hardware failure in our datacenter
> -- three different SSDs failed overnight, two of them in cloudvirt1018 and
> one of them in cloudvirt1024. The VMs on 1018 are down entirely. We may
> move those on 1024 to another host shortly in order to guard against
> additional drive failure.
>
> There's some possibility that we will experience permanent data loss on
> cloudvirt1018, but everyone is working hard to avoid this.
>
> The following VMs are on cloudvirt1018:
>
> a11y | reading-web-staging
> abogott-scapserver | testlabs
> af-puppetdb01 | automation-framework
> api | openocr
> asdf | quotatest
> bastion-eqiad1-02 | bastion
> clm-test-01 | community-labs-monitoring
> compiler1002 | puppet-diffs
> cyberbot-exec-iabot-01 | cyberbot
> deployment-db03 | deployment-prep
> deployment-db04 | deployment-prep
> deployment-memc05 | deployment-prep
> deployment-pdfrender02 | deployment-prep
> deployment-sca01 | deployment-prep
> design-lsg3 | design
> eventmetrics-dev01 | eventmetrics
> fridolin | catgraph
> gtirloni-puppetmaster-01 | testlabs
> hadoop-master-3 | analytics
> ign | ign2commons
> integration-castor03 | integration
> integration-slave-docker-1017 | integration
> integration-slave-docker-1033 | integration
> integration-slave-docker-1038 | integration
> integration-slave-jessie-1003 | integration
> integration-slave-jessie-android | integration
> k8s-master-01 | general-k8s
> k8s-node-03 | general-k8s
> k8s-node-05 | general-k8s
> k8s-node-06 | general-k8s
> kdc | analytics
> labstash-jessie1 | logging
> language-mleb-legacy | language
> login-test | catgraph
> lsg-01 | design
> mathosphere | math
> mc-clusterA-1 | test-twemproxy
> mwoffliner5 | mwoffliner
> novaadminmadethis-4 | quotatest
> ntp-01 | cloudinfra
> ntp-02 | cloudinfra
> ogvjs-testing | ogvjs-integration
> phragile-pro | phragile
> planet-hotdog | planet
> pub2 | wikiapiary
> puppenmeister | planet
> puppet-compiler-v4-other | testlabs
> puppet-compiler-v4-tools | testlabs
> quarry-beta-01 | quarry
> signwriting-swis | signwriting
> signwriting-swserver | signwriting
> social-tools3 | social-tools
> striker-deploy04 | striker
> striker-puppet01 | striker
> t166878 | otrs
> togetherjs | visualeditor
> tools-sgebastion-06 | tools
> tools-sgeexec-0902 | tools
> tools-sgeexec-0903 | tools
> tools-sgewebgrid-generic-0901 | tools
> tools-sgewebgrid-lighttpd-0901 | tools
> ve-font | design
> wikibase1 | sciencesource
> wikicitevis-prod | wikicitevis
> wikifarm | pluggableauth
> women-in-red | globaleducation
>
> _______________________________________________
> Wikimedia Cloud Services announce mailing list
> Cloud-announce at lists.wikimedia.org (formerly labs-announce at lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/cloud-announce
> _______________________________________________
> Wikimedia Cloud Services mailing list
> Cloud at lists.wikimedia.org (formerly labs-l at lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/cloud
----- End forwarded message -----
--
| Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager A18D 1138 8E47 FAC8 1C7D |