These changes are now in effect. Please let me know if you see any unexpected behavior.
(btw, there was just now a hardware issue in the datacenter which caused some bad behavior in Toolforge. That's unrelated to the naming change, and should be largely resolved now.)
-Andrew
-------- Forwarded Message --------
Subject: Daily VM migrations starting Monday, September 14th -- includes bastion outages
Date: Wed, 2 Sep 2020 11:22:29 -0500
From: Andrew Bogott andrewbogott@gmail.com
Reply-To: andrewbogott@gmail.com
To: Cloud-announce@lists.wikimedia.org
tl;dr #1: Some VMs will have brief downtime the week of the 14th; check the lists at the bottom of this email for affected instances and timing.
tl;dr #2: Several bastions (including secondary.bastion.wmcloud.org) will be moved and rebooted at 14:00 UTC on Monday the 14th.
tl;dr #3: New ‘g2’ VM flavors will soon be available in Horizon, at which point you are discouraged from using the old ‘m1’ names.
tl;dr #4: Don’t let this announcement distract from the other important thing that’s happening: the deprecation of the .wmflabs domain for new VMs next week.
== what's happening ==
In a few weeks we will begin moving VMs to our new storage platform, Ceph[0]. This move requires a full shutdown of each VM while it is copied over. We'll begin by evacuating our oldest hypervisors, cloudvirt1001-1009, two per day during the week of the 14th. Only one VM will be moved at a time, but the timing will be unpredictable for any given server.
To avoid unpredictable interruptions to ongoing work, I'm going to move the following bastion hosts first, at 14:00 UTC (that's 7AM Pacific time) on Monday the 14th. Those bastions are:
- bastion-eqiad1-02 (AKA instance-bastion-eqiad1-02.bastion.wmflabs.org AKA instance-bastion-eqiad1-02.bastion.wmcloud.org AKA secondary.bastion.wmcloud.org AKA secondary.bastion.wmflabs.org)
- bastion-restricted-eqiad1-01
- tools-sgebastion-09
== before the move ==
If your VMs appear in the list below you should either plan a three-hour downtime on the day listed (14:00-17:00 UTC), or contact me on IRC to have your VM moved by hand ahead of time.
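To check whether any of your project's VMs are affected, the schedule at the bottom of this email can be scanned mechanically. The following is only a rough sketch: it assumes each day starts with a header line like 'Monday, 2020-09-14, ...' and that each VM line has the form 'name.project.eqiad1.wikimedia.cloud (uuid)', as in the list below. The function name vms_by_project and the SAMPLE excerpt are made up for illustration.

```python
import re
from collections import defaultdict

# Day header, e.g. "Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001"
HEADER_RE = re.compile(r"^\w+, (\d{4}-\d{2}-\d{2}),")
# VM line, e.g. "dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-...)"
VM_RE = re.compile(
    r"^([^.\s]+)\.([^.\s]+)\.eqiad1\.wikimedia\.cloud \(([0-9a-f-]{36})\)$"
)


def vms_by_project(schedule_text):
    """Group scheduled VMs by project as (date, vm_name, uuid) tuples."""
    result = defaultdict(list)
    current_date = None
    for raw_line in schedule_text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            current_date = header.group(1)
            continue
        vm = VM_RE.match(line)
        if vm and current_date:
            name, project, uuid = vm.groups()
            result[project].append((current_date, name, uuid))
    return dict(result)


# Short excerpt copied from the schedule in this email.
SAMPLE = """\
Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001
dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-4491-4cb0-85b1-f41e06772a2c)
Wednesday, 2020-09-16, 14:00-17:00 UTC: cloudvirt1005, cloudvirt1007
dumps-5.dumps.eqiad1.wikimedia.cloud (3b380b08-7f77-4c00-bc0d-f8b3f3ed24b8)
"""

for date, name, uuid in vms_by_project(SAMPLE).get("dumps", []):
    print(date, name, uuid)
```

Paste the full schedule in place of SAMPLE and look up your own project name to see which days you need to plan for.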
In the days preceding this move, you will see several new flavor options appear in the Horizon interface for new VMs. They will have standard stats-based names preceded by ‘g2’, for example ‘g2.cores1.ram2.disk80’. These new flavors will be bound to the Ceph backend such that any new VMs created with those flavors will be run on new hypervisors and stored on the Ceph backend. You're encouraged to start using these new flavors as soon as they appear.
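If you have scripts that pick flavors by name, the stats-based naming makes the specs easy to extract. Here is a minimal sketch, assuming every new flavor follows the same 'g2.cores<N>.ram<N>.disk<N>' pattern as the example above; the units of the ram and disk numbers are not spelled out here, so the code just returns the raw integers. FlavorSpec and parse_g2_flavor are hypothetical names, not part of any OpenStack tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, based only on the example 'g2.cores1.ram2.disk80'.
FLAVOR_RE = re.compile(r"^g2\.cores(\d+)\.ram(\d+)\.disk(\d+)$")


class FlavorSpec(NamedTuple):
    cores: int
    ram: int   # raw number from the name; units not specified in this email
    disk: int  # raw number from the name; units not specified in this email


def parse_g2_flavor(name: str) -> Optional[FlavorSpec]:
    """Return the specs encoded in a g2 flavor name, or None if it doesn't match."""
    match = FLAVOR_RE.match(name)
    if match is None:
        return None
    return FlavorSpec(*(int(group) for group in match.groups()))


print(parse_g2_flavor("g2.cores1.ram2.disk80"))
print(parse_g2_flavor("m1.small"))  # old-style names are not parsed
```

Old "m1"-style names return None, which gives a script a simple way to flag flavors that should no longer be used for new VMs.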
== during the move ==
Each VM will be shut down, copied, have its flavor adjusted, and then restarted. The total downtime will vary depending on the size of the VM but will generally be measured in minutes rather than hours.
== after the move ==
Migrated VMs will display in Horizon with a new flavor name. The new flavors will have the same specs (cores, ram, disk) as the former flavor but will include Ceph-specific metadata.
A side-effect of the move is that VMs from cloudvirt1001-1009 will be running on much newer hardware, so CPU-intensive activities should be quite a bit faster. File I/O will be moderately slower than before. If you have a workflow that is rendered impractical by the I/O changes, please open a Phabricator task to discuss your options.
== eventually ==
After the dust has settled from the first round of migrations (probably sometime during the week of the 23rd) we will disable creation of new non-Ceph VMs. That means that the old "m1.small"-style flavors will still display in Horizon and openstack-browser but only VMs marked with new ‘g2’ flavor names will build successfully.
Remaining VMs (on cloudvirt1012 through cloudvirt1030) will be moved to Ceph in future weeks. Keep an eye out for emails announcing such moves.
== Schedule ==
Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001
cn-staging-1.centralnotice-staging.eqiad1.wikimedia.cloud (178d85a9-cc6f-4b61-bc3a-eaedc7e4a219)
cloud-puppetmaster-04.cloudinfra.eqiad1.wikimedia.cloud (5ea5ac40-43a7-42f3-b986-26f5803b89fc)
deploy-1002.devtools.eqiad1.wikimedia.cloud (8ffecd5f-4de6-4c89-904a-3879612da6a5)
dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-4491-4cb0-85b1-f41e06772a2c)
pontoon-puppet-01.monitoring.eqiad1.wikimedia.cloud (43cdcd6e-7259-41b1-ae0d-8c4d8c1e2977)
ores-worker-02.ores.eqiad1.wikimedia.cloud (0ff5345d-53b5-4c75-998d-c7ee235c469e)
ores-misc-01.ores-staging.eqiad1.wikimedia.cloud (ee7b9541-ee56-4c83-ba6e-221a5427eb61)
wikibase-scisrc.sciencesource.eqiad1.wikimedia.cloud (da82ab0b-1a42-42fe-b139-77586aadef80)
techblog-puppetmaster-01.techblog.eqiad1.wikimedia.cloud (e16ebfcc-ee77-4f70-9057-dc5fa5fc900c)
tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud (963ec21f-7976-4765-87c2-fe66e4b1538d)
toolsbeta-workflow-test.toolsbeta.eqiad1.wikimedia.cloud (2e989cfb-5ade-4067-844a-55a9d0a967c2)
libcanada-01.wikibase-registry.eqiad1.wikimedia.cloud (596da987-3e2d-4e1a-8574-a2a5b232104d)
Tuesday, 2020-09-15, 14:00-17:00 UTC: cloudvirt1002 and cloudvirt1003
accounts-appserver5.account-creation-assistance.eqiad1.wikimedia.cloud (75538139-b1cd-4849-80fc-962e3c59c005)
blog-news.blog.eqiad1.wikimedia.cloud (4fe2be8a-7ac2-4d7a-99ab-058ae5de5bb5)
cloud-cumin-01.cloudinfra.eqiad1.wikimedia.cloud (d972a73e-08aa-4696-b95b-e50efc020ade)
deployment-docker-cxserver01.deployment-prep.eqiad1.wikimedia.cloud (60034343-58a3-4e72-8d1c-5cd7eca6da44)
deployment-docker-citoid01.deployment-prep.eqiad1.wikimedia.cloud (e96e40b6-c46c-42fd-8490-803b0c79acd8)
deployment-docker-mathoid01.deployment-prep.eqiad1.wikimedia.cloud (eb0903b7-9c0e-42a0-9545-f3b3e3e82b21)
google-api-proxy-03.google-api-proxy.eqiad1.wikimedia.cloud (fc3aa841-2933-4c37-a0ab-b7d4cdac384f)
hashtags-staging.hashtags.eqiad1.wikimedia.cloud (ff094cac-29f8-4dbc-9e32-b3e3150f6184)
soweego.soweego.eqiad1.wikimedia.cloud (810aa430-cb07-45c2-b278-0f33a0243b5b)
striker-support01.striker.eqiad1.wikimedia.cloud (53098a1a-5180-42ff-893c-2ba6ef90d054)
toolsbeta-mail-01.toolsbeta.eqiad1.wikimedia.cloud (ca82223a-0332-4557-a2fb-1964207458da)
toolsbeta-test-k8s-worker-3.toolsbeta.eqiad1.wikimedia.cloud (e3f63958-e755-49b6-b349-445283e058b8)
toolsbeta-proxy-2.toolsbeta.eqiad1.wikimedia.cloud (fb26f477-854e-4de4-919b-c9d74152a7b1)
sdc-test-runner.wikidata-federation.eqiad1.wikimedia.cloud (8bab2343-68db-4af7-aabd-59edb015e905)
xtools-prod07.xtools.eqiad1.wikimedia.cloud (ad57a225-5795-42a2-8315-de77ad96d561)
commtech-bot.commtech.eqiad1.wikimedia.cloud (3d8530fe-286f-4b35-b0af-f1499efb3adb)
puppetmaster-1001.devtools.eqiad1.wikimedia.cloud (76db63cd-0538-4015-8e9c-564d82107044)
cx-ofb.language.eqiad1.wikimedia.cloud (0f619b14-97b3-4133-95ad-8bbfaca88ab1)
meza-full.meza.eqiad1.wikimedia.cloud (d0e9258f-8e6d-42ba-9d59-365650dd6f59)
quarry-dev-01.quarry.eqiad1.wikimedia.cloud (6537f032-aa7b-41ae-9e17-7060ca69e7bc)
tools-prometheus-04.tools.eqiad1.wikimedia.cloud (2f970556-8714-48c9-bd90-b1736e4c6536)
toolsbeta-sgewebgrid-lighttpd-0901.toolsbeta.eqiad1.wikimedia.cloud (6169633b-2972-43e7-9327-7a083459084b)
traffic-ncredir.traffic.eqiad1.wikimedia.cloud (0478d505-62dd-4dad-9fd3-5ba84b9ba52b)
encoding04.video.eqiad1.wikimedia.cloud (98313eae-12f7-48de-8635-9d1d3ef33b8e)
encoding05.video.eqiad1.wikimedia.cloud (d1359499-dfb4-4303-a225-9eb52ac6ac09)
Wednesday, 2020-09-16, 14:00-17:00 UTC: cloudvirt1005 and cloudvirt1007
asyncwiki-1.asyncwiki.eqiad1.wikimedia.cloud (01d14217-5b65-4cc2-a2a1-f40e2b72a843)
deployment-wikifeeds01.deployment-prep.eqiad1.wikimedia.cloud (eee76035-4494-4f99-9aa8-851dc11146c8)
discuss-space.discourse.eqiad1.wikimedia.cloud (ee52ffd6-25e4-4807-bf39-1a069ae25f1e)
dumps-5.dumps.eqiad1.wikimedia.cloud (3b380b08-7f77-4c00-bc0d-f8b3f3ed24b8)
extdist-05.extdist.eqiad1.wikimedia.cloud (80341589-cdc6-4c26-a12d-c9bd5be778aa)
gratsync.gratitude.eqiad1.wikimedia.cloud (4a95335b-6c82-4877-900e-5cbf805401c2)
language-translate.language.eqiad1.wikimedia.cloud (b6bcacfc-4fc7-4923-a1ec-cae9d5c11b89)
ocrtoy-web.ocrtoy.eqiad1.wikimedia.cloud (5ff400eb-79c8-41e8-9f19-20cc2da6e632)
ores-web-06.ores.eqiad1.wikimedia.cloud (e0c6366d-a805-47a3-82dc-de7dd1759754)
ores-worker-03.ores.eqiad1.wikimedia.cloud (ceada679-d8e5-4f5a-96ea-30967d4a9882)
tools-docker-imagebuilder-01.tools.eqiad1.wikimedia.cloud (90d1ae60-10f7-4596-9a1c-d63cf65ff1e0)
prod01.twl.eqiad1.wikimedia.cloud (d11750b9-3578-47e8-9821-2e3f6ddf6371)
deployment-elastic05.deployment-prep.eqiad1.wikimedia.cloud (f0a0822d-4a84-493d-a28b-df985bf739ba)
hashtags-prod.hashtags.eqiad1.wikimedia.cloud (35ef6320-6ceb-4fe6-93f2-e3594ed4e9db)
medbox3-iiab.iiab.eqiad1.wikimedia.cloud (9e0f875e-5a12-49b9-b2e9-8d2943e4bb97)
language-eg.language.eqiad1.wikimedia.cloud (e4e71a4b-7121-45ef-844b-775017691e37)
labs-bootstrapvz-jessie.openstack.eqiad1.wikimedia.cloud (275ccc6d-3730-42c8-8a05-5293ef0db44a)
tools-elastic-3.tools.eqiad1.wikimedia.cloud (3f7164ce-0eb5-44af-bf80-05c6bba29ec0)
tools-sgegrid-master.tools.eqiad1.wikimedia.cloud (123949a5-8b58-40e9-97db-a52709a80d5c)
toolsbeta-sgebastion-04.toolsbeta.eqiad1.wikimedia.cloud (58cf8c32-5af5-4313-8bcd-1d48124faf09)
utrs-database2.utrs.eqiad1.wikimedia.cloud (89ef3df5-2a09-499e-a73d-682f4454c449)
wikilabels-backups-01.wikilabels.eqiad1.wikimedia.cloud (4a85bfa1-f920-489c-badb-e8cc9f8d5692)
Thursday, 2020-09-17, 14:00-17:00 UTC: cloudvirt1008 and cloudvirt1009
tracker1.lta-tracker.eqiad1.wikimedia.cloud (c5ed2a09-d95b-4658-80da-babc88f17053)
pontoon-logstash7-02.monitoring.eqiad1.wikimedia.cloud (de390caf-2515-4547-b61d-6b911869671e)
jeh-puppet.testlabs.eqiad1.wikimedia.cloud (5289f03b-1e69-4c4d-810e-1e813be40ed0)
tools-mail-02.tools.eqiad1.wikimedia.cloud (777ed08b-16db-4911-9c0d-0c8052c3f99f)
tools-clushmaster-02.tools.eqiad1.wikimedia.cloud (adcfb94d-be6f-4d90-8d16-8fa6bdbb2419)
toolsbeta-puppetmaster-03.toolsbeta.eqiad1.wikimedia.cloud (ffa2ce19-3340-48e5-889a-f6c580e2b5b4)
toolsbeta-legacy-redirector.toolsbeta.eqiad1.wikimedia.cloud (a9111792-2e9d-4910-a15e-5fdac7a5c54e)
toolsbeta-paws-worker-1002.toolsbeta.eqiad1.wikimedia.cloud (496f5c76-ee3d-49d1-a8db-20bf33b30153)
wikidata-realtime-dumps.wikidata-realtime-dumps.eqiad1.wikimedia.cloud (06179518-b478-4fdc-95a5-4b4e18b55208)
cloudstore-dev-02.cloudstore.eqiad1.wikimedia.cloud (2f897f38-e151-4873-9472-083545bcf351)
canary1009-01.cloudvirt-canary.eqiad1.wikimedia.cloud (1b675b77-444b-4e25-81a8-3eb0587a7bfe)
deployment-aqs03.deployment-prep.eqiad1.wikimedia.cloud (950f0b0a-e797-426d-80e4-a9c1b6fb9aca)
puppet-lta.lta-tracker.eqiad1.wikimedia.cloud (8b764eb9-2dca-4902-a9c5-ed54fa3fc57d)
osmit-test.osmit.eqiad1.wikimedia.cloud (eafdf7bf-7b08-48ec-b6d8-828e391799f1)
toolsbeta-paws-master-01.toolsbeta.eqiad1.wikimedia.cloud (268ff37d-f5eb-4cbc-9c28-6782f3a94f50)
[0] https://techblog.wikimedia.org/2020/08/24/ceph-distributed-vm-storage-coming...