[QA] Fwd: [Cloud] [Cloud-announce] Maintenance reboots TODAY

Greg Grossmeier greg at wikimedia.org
Tue Jan 16 19:16:33 UTC 2018


FYI, Beta Cluster will be anywhere from just fine to completely broken
throughout today as the hosts are rebooted to apply a security patch to
the kernel.

Sorry for the inconvenience.

Greg

----- Forwarded message from Andrew Bogott <abogott at wikimedia.org> -----

> Date: Tue, 16 Jan 2018 08:57:57 -0600
> From: Andrew Bogott <abogott at wikimedia.org>
> To: Cloud-announce at lists.wikimedia.org
> Subject: [Cloud] [Cloud-announce] Maintenance reboots TODAY
> Reply-To: cloud at lists.wikimedia.org
> 
> Good morning!
> 
> The canary reboots last week went well, so we'll be upgrading and rebooting
> the rest of the cloud over the course of the day today, beginning in a few
> minutes.
> 
> As always, we'll do our best to minimize effects within Toolforge, although
> it's always a good idea to make sure your jobs are still running after
> windows like this.  The VMs on last week's list (attached below) are
> already good to go, so they should be unaffected today.
> 
> -Andrew
> 
> 
> 
> On 1/11/18 3:15 PM, Andrew Bogott wrote:
> > Today's round of reboots is now finished -- the hosts rebooted are
> > listed below.
> > 
> > One correction:  Monday is a holiday, so we're planning to reboot the
> > rest of the fleet on Tuesday, January 16th.  Any VMs not in the list
> > below should anticipate downtime at some point on Tuesday.
> > 
> > -Andrew
> > 
> > 
> > On 1/11/18 1:02 PM, Andrew Bogott wrote:
> > > In a few minutes I'm going to start the first round of reboots. 
> > > We're going to do a subset of the cloud and then make sure there are
> > > no bad effects before doing the remainder on Monday.
> > > 
> > > The following VMs will be upgraded and rebooted over the next few hours:
> > > 
> > > 
> > > aborrero-test: puppet-vm
> > > account-creation-assistance: accounts-dbslave
> > > analytics: hadoop-worker-3
> > > analytics: k3-1
> > > analytics: k3-2
> > > automation-framework: af-debmonitor
> > > automation-framework: af-puppetdb02
> > > butterfly: butterfly-m4m
> > > catgraph: fishbone
> > > cvn: cvn-apache9
> > > cvn: cvn-app8
> > > cvn: cvn-app9
> > > cyberbot: cyberbot-exec-01
> > > cyberbot: cyberbot-exec-iabot-01
> > > deployment-prep: deployment-cassandra3-02
> > > deployment-prep: deployment-cpjobqueue
> > > deployment-prep: deployment-kafka-jumbo-1
> > > deployment-prep: deployment-memc05
> > > deployment-prep: deployment-mx
> > > deployment-prep: deployment-netbox
> > > deployment-prep: deployment-redis01
> > > deployment-prep: deployment-redis05
> > > deployment-prep: deployment-sca01
> > > deployment-prep: deployment-sca03
> > > discovery-stats: language-detector-01
> > > dwl: taxonbot
> > > git: gerrit-test
> > > git: gerrit-test3
> > > glampipe: Glampipe
> > > globaleducation: women-in-red
> > > hhvm: hhvm-jmm
> > > huggle: huggle-wl
> > > integration: integration-slave-docker-1004
> > > integration: integration-slave-docker-1005
> > > integration: integration-slave-jessie-1003
> > > integration: integration-slave-jessie-1004
> > > kubernetes-testing: kmaster
> > > language: language-dev
> > > mediawiki-vagrant: mwv-stretch-migration
> > > monitoring: filippo-test-jessie3
> > > mwstake: mwstake
> > > ogvjs-integration: media-streaming
> > > otrs: otrs-oneclickspam-test
> > > phabricator: puppet-phabricator
> > > planet: puppenmeister
> > > pluggableauth: cindy
> > > pluggableauth: oidc-google
> > > privpol-captcha: captcha-consul-32
> > > privpol-captcha: captcha-tf-31
> > > project-smtp: smtp-test1
> > > rcm: oxygen
> > > reading-web-staging: chromium-pdf
> > > reading-web-staging: proton-staging
> > > recommendation-api: recommendation-api-build
> > > redirects: redirects-nginx01
> > > scrumbugz: wikibase-docker-20171109-1
> > > search: search-jessie
> > > security-tools: jobs
> > > security-tools: scanner00
> > > security-tools: two-factor
> > > security-tools: xsstest
> > > sentry: sentry-builder
> > > services: ceph-1
> > > services: pdfservice
> > > services: sca1
> > > suggestbot: suggestbot-prod
> > > swift: swift-prometheus
> > > testlabs: puppet-compiler-tools
> > > testlabs: puppet-compiler-v4-tools
> > > testlabs: util-abogott
> > > toolserver-legacy: relic
> > > traffic: traffic-misc-varnish5
> > > traffic: traffic-peerassist
> > > traffic: traffic-upload-varnish5
> > > ttmserver: ttmserver-elasticsearch01
> > > ttmserver: ttmserver-salt01
> > > twl: twlight-prod
> > > twl: twlight-staging
> > > wikibrain: wikibrain-embeddings-02
> > > wikidata-dev: elastic-wikidata
> > > wikidata-query: wdqs-deploy
> > > wikidata-topicmaps: wtui-new
> > > wikifactmine: elasticsearch-01
> > > wikimania-support: scholarships-02
> > > wmam: wikikids
> > > yandex-proxy: yandex-proxy01
> > > 
> > > 
> > > 
> > > On 1/4/18 9:28 AM, Andrew Bogott wrote:
> > > > Sometime soon (probably in the next day or two) we will be
> > > > applying kernel patches to all VMs and physical hosts in WMCS.
> > > > This is to address an urgent security issue[1], so we'll be
> > > > skipping the traditional 7-day warning period -- basically as
> > > > soon as proper fixes are available we'll start patching and
> > > > rebooting.
> > > > 
> > > > As usual, we'll do our best to re-balance Toolforge grid nodes,
> > > > so impact on Toolforge users should be minimal (worst case you
> > > > may need to manually restart interrupted tasks).
> > > > 
> > > > For other users: if your VPS project requires special handling
> > > > or specific notice about when a particular VM will reboot,
> > > > please add a subtask describing your need to
> > > > https://phabricator.wikimedia.org/T184189 .
> > > > 
> > > > 
> > > > [1] https://en.wikipedia.org/wiki/Meltdown_(security_vulnerability)
> > > > 
> > > 
> > 
> 
> 
> _______________________________________________
> Wikimedia Cloud Services announce mailing list
> Cloud-announce at lists.wikimedia.org (formerly labs-announce at lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/cloud-announce
> _______________________________________________
> Wikimedia Cloud Services mailing list
> Cloud at lists.wikimedia.org (formerly labs-l at lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/cloud

----- End forwarded message -----

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager            A18D 1138 8E47 FAC8 1C7D |


