tl;dr #1: Some VMs will have brief downtime the week of the 14th; check
the lists at the bottom of this email for affected instances and timing.
tl;dr #2: Several bastions (including secondary.bastion.wmcloud.org)
will be moved and rebooted at 14:00 UTC on Monday the 14th.
tl;dr #3: New ‘g2’ VM flavors will soon be available in Horizon, at
which point you are discouraged from using the old ‘m1’ names.
tl;dr #4: Don’t let this announcement distract from the other important
thing that’s happening: the deprecation of the .wmflabs domain for new
VMs next week.
== what's happening ==
In a few weeks we will begin moving VMs to our new storage platform,
Ceph[0]. This move requires a full shutdown of each VM while it is
copied over. We'll begin by evacuating our oldest hypervisors,
cloudvirt1001-1009, up to two per day during the week of the 14th. Only one VM
will be moved at a time, but the timing will be unpredictable for any
given server.
To avoid unpredictable interruptions to ongoing work, I'm going to move
the following bastion hosts first, at 14:00 UTC (that's 7AM Pacific
time) on Monday the 14th. Those bastions are:
- bastion-eqiad1-02 (AKA instance-bastion-eqiad1-02.bastion.wmflabs.org
AKA instance-bastion-eqiad1-02.bastion.wmcloud.org AKA
secondary.bastion.wmcloud.org AKA secondary.bastion.wmflabs.org)
- bastion-restricted-eqiad1-01
- tools-sgebastion-09
== before the move ==
If your VMs appear in the list below you should either plan a three-hour
downtime on the day listed (14:00-17:00 UTC), or contact me on IRC to
have your VM moved by hand ahead of time.
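To see where the 14:00-17:00 UTC window falls in your own working day, the conversion can be sketched as below. This is only an illustration: the function name is made up for this example, the date is taken from the first day of the schedule, and a fixed UTC offset is assumed rather than a named time zone.

```python
from datetime import datetime, timedelta, timezone

# First migration window from the schedule (Monday, 2020-09-14).
WINDOW_START = datetime(2020, 9, 14, 14, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2020, 9, 14, 17, 0, tzinfo=timezone.utc)


def window_at_offset(hours_from_utc):
    """Return the window as (start, end) "HH:MM" strings at a fixed UTC offset.

    hours_from_utc is an assumption supplied by the caller, e.g. -7 for
    Pacific Daylight Time in September.
    """
    tz = timezone(timedelta(hours=hours_from_utc))
    start = WINDOW_START.astimezone(tz).strftime("%H:%M")
    end = WINDOW_END.astimezone(tz).strftime("%H:%M")
    return start, end


print(window_at_offset(-7))  # Pacific Daylight Time: ('07:00', '10:00')
```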
In the days preceding this move, you will see several new flavor options
appear in the Horizon interface for new VMs. They will have standard
stats-based names preceded by ‘g2’, for example
‘g2.cores1.ram2.disk80’. These new flavors will be bound to the Ceph
backend such that any new VMs created with those flavors will be run on
new hypervisors and stored on the Ceph backend. You're encouraged to
start using these new flavors as soon as they appear.
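If you script VM creation, the stats-based names are easy to pick apart. The following is a minimal sketch that assumes every new flavor follows the same pattern as the one example given above; the pattern, the class, and the function name are guesses for illustration, not an official naming specification, and the units of each number are not stated here.

```python
import re
from typing import NamedTuple, Optional


class FlavorSpec(NamedTuple):
    generation: str  # e.g. "g2"
    cores: int
    ram: int   # unit as used in the flavor name
    disk: int  # unit as used in the flavor name


# Pattern inferred from the single example 'g2.cores1.ram2.disk80'.
_FLAVOR_RE = re.compile(r"^(g\d+)\.cores(\d+)\.ram(\d+)\.disk(\d+)$")


def parse_flavor(name: str) -> Optional[FlavorSpec]:
    """Parse a stats-based flavor name; return None for other names."""
    match = _FLAVOR_RE.match(name)
    if match is None:
        return None  # e.g. an old "m1.small"-style name
    generation, cores, ram, disk = match.groups()
    return FlavorSpec(generation, int(cores), int(ram), int(disk))


print(parse_flavor("g2.cores1.ram2.disk80"))
print(parse_flavor("m1.small"))
```

A check like `parse_flavor(name) is not None` is one way to make sure a script is not still requesting an old ‘m1’ flavor.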
== during the move ==
Each VM will be shut down, copied, have its flavor adjusted, and then
restarted. The total downtime will vary depending on the size of the VM
but will generally be measured in minutes rather than hours.
== after the move ==
Migrated VMs will display in Horizon with a new flavor name. The new
flavors will have the same specs (cores, ram, disk) as the former
flavor but will include Ceph-specific metadata.
A side-effect of the move is that VMs from cloudvirts1001-1009 will be
running on much newer hardware, so CPU-intensive activities should be
quite a bit faster. File IO will be moderately slower than before. If
you have a workflow that is rendered impractical by the IO changes,
please open a Phabricator task to discuss your options.
== eventually ==
After the dust has settled from the first round of migrations (probably
sometime during the week of the 23rd) we will disable creation of new
non-Ceph VMs. That means that the old "m1.small"-style flavors will
still display in Horizon and openstack-browser but only VMs marked with
new ‘g2’ flavor names will build successfully.
Remaining VMs (on cloudvirts1012 through 1030) will be moved to Ceph in
future weeks. Keep an eye out for emails announcing such moves.
== Schedule ==
Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001
cn-staging-1.centralnotice-staging.eqiad1.wikimedia.cloud
(178d85a9-cc6f-4b61-bc3a-eaedc7e4a219)
cloud-puppetmaster-04.cloudinfra.eqiad1.wikimedia.cloud
(5ea5ac40-43a7-42f3-b986-26f5803b89fc)
deploy-1002.devtools.eqiad1.wikimedia.cloud
(8ffecd5f-4de6-4c89-904a-3879612da6a5)
dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-4491-4cb0-85b1-f41e06772a2c)
pontoon-puppet-01.monitoring.eqiad1.wikimedia.cloud
(43cdcd6e-7259-41b1-ae0d-8c4d8c1e2977)
ores-worker-02.ores.eqiad1.wikimedia.cloud
(0ff5345d-53b5-4c75-998d-c7ee235c469e)
ores-misc-01.ores-staging.eqiad1.wikimedia.cloud
(ee7b9541-ee56-4c83-ba6e-221a5427eb61)
wikibase-scisrc.sciencesource.eqiad1.wikimedia.cloud
(da82ab0b-1a42-42fe-b139-77586aadef80)
techblog-puppetmaster-01.techblog.eqiad1.wikimedia.cloud
(e16ebfcc-ee77-4f70-9057-dc5fa5fc900c)
tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud
(963ec21f-7976-4765-87c2-fe66e4b1538d)
toolsbeta-workflow-test.toolsbeta.eqiad1.wikimedia.cloud
(2e989cfb-5ade-4067-844a-55a9d0a967c2)
libcanada-01.wikibase-registry.eqiad1.wikimedia.cloud
(596da987-3e2d-4e1a-8574-a2a5b232104d)
Tuesday, 2020-09-15, 14:00-17:00 UTC: cloudvirt1002 and cloudvirt1003
accounts-appserver5.account-creation-assistance.eqiad1.wikimedia.cloud
(75538139-b1cd-4849-80fc-962e3c59c005)
blog-news.blog.eqiad1.wikimedia.cloud (4fe2be8a-7ac2-4d7a-99ab-058ae5de5bb5)
cloud-cumin-01.cloudinfra.eqiad1.wikimedia.cloud
(d972a73e-08aa-4696-b95b-e50efc020ade)
deployment-docker-cxserver01.deployment-prep.eqiad1.wikimedia.cloud
(60034343-58a3-4e72-8d1c-5cd7eca6da44)
deployment-docker-citoid01.deployment-prep.eqiad1.wikimedia.cloud
(e96e40b6-c46c-42fd-8490-803b0c79acd8)
deployment-docker-mathoid01.deployment-prep.eqiad1.wikimedia.cloud
(eb0903b7-9c0e-42a0-9545-f3b3e3e82b21)
google-api-proxy-03.google-api-proxy.eqiad1.wikimedia.cloud
(fc3aa841-2933-4c37-a0ab-b7d4cdac384f)
hashtags-staging.hashtags.eqiad1.wikimedia.cloud
(ff094cac-29f8-4dbc-9e32-b3e3150f6184)
soweego.soweego.eqiad1.wikimedia.cloud
(810aa430-cb07-45c2-b278-0f33a0243b5b)
striker-support01.striker.eqiad1.wikimedia.cloud
(53098a1a-5180-42ff-893c-2ba6ef90d054)
toolsbeta-mail-01.toolsbeta.eqiad1.wikimedia.cloud
(ca82223a-0332-4557-a2fb-1964207458da)
toolsbeta-test-k8s-worker-3.toolsbeta.eqiad1.wikimedia.cloud
(e3f63958-e755-49b6-b349-445283e058b8)
toolsbeta-proxy-2.toolsbeta.eqiad1.wikimedia.cloud
(fb26f477-854e-4de4-919b-c9d74152a7b1)
sdc-test-runner.wikidata-federation.eqiad1.wikimedia.cloud
(8bab2343-68db-4af7-aabd-59edb015e905)
xtools-prod07.xtools.eqiad1.wikimedia.cloud
(ad57a225-5795-42a2-8315-de77ad96d561)
commtech-bot.commtech.eqiad1.wikimedia.cloud
(3d8530fe-286f-4b35-b0af-f1499efb3adb)
puppetmaster-1001.devtools.eqiad1.wikimedia.cloud
(76db63cd-0538-4015-8e9c-564d82107044)
cx-ofb.language.eqiad1.wikimedia.cloud
(0f619b14-97b3-4133-95ad-8bbfaca88ab1)
meza-full.meza.eqiad1.wikimedia.cloud (d0e9258f-8e6d-42ba-9d59-365650dd6f59)
quarry-dev-01.quarry.eqiad1.wikimedia.cloud
(6537f032-aa7b-41ae-9e17-7060ca69e7bc)
tools-prometheus-04.tools.eqiad1.wikimedia.cloud
(2f970556-8714-48c9-bd90-b1736e4c6536)
toolsbeta-sgewebgrid-lighttpd-0901.toolsbeta.eqiad1.wikimedia.cloud
(6169633b-2972-43e7-9327-7a083459084b)
traffic-ncredir.traffic.eqiad1.wikimedia.cloud
(0478d505-62dd-4dad-9fd3-5ba84b9ba52b)
encoding04.video.eqiad1.wikimedia.cloud
(98313eae-12f7-48de-8635-9d1d3ef33b8e)
encoding05.video.eqiad1.wikimedia.cloud
(d1359499-dfb4-4303-a225-9eb52ac6ac09)
Wednesday, 2020-09-16, 14:00-17:00 UTC: cloudvirt1005 and cloudvirt1007
asyncwiki-1.asyncwiki.eqiad1.wikimedia.cloud
(01d14217-5b65-4cc2-a2a1-f40e2b72a843)
deployment-wikifeeds01.deployment-prep.eqiad1.wikimedia.cloud
(eee76035-4494-4f99-9aa8-851dc11146c8)
discuss-space.discourse.eqiad1.wikimedia.cloud
(ee52ffd6-25e4-4807-bf39-1a069ae25f1e)
dumps-5.dumps.eqiad1.wikimedia.cloud (3b380b08-7f77-4c00-bc0d-f8b3f3ed24b8)
extdist-05.extdist.eqiad1.wikimedia.cloud
(80341589-cdc6-4c26-a12d-c9bd5be778aa)
gratsync.gratitude.eqiad1.wikimedia.cloud
(4a95335b-6c82-4877-900e-5cbf805401c2)
language-translate.language.eqiad1.wikimedia.cloud
(b6bcacfc-4fc7-4923-a1ec-cae9d5c11b89)
ocrtoy-web.ocrtoy.eqiad1.wikimedia.cloud
(5ff400eb-79c8-41e8-9f19-20cc2da6e632)
ores-web-06.ores.eqiad1.wikimedia.cloud
(e0c6366d-a805-47a3-82dc-de7dd1759754)
ores-worker-03.ores.eqiad1.wikimedia.cloud
(ceada679-d8e5-4f5a-96ea-30967d4a9882)
tools-docker-imagebuilder-01.tools.eqiad1.wikimedia.cloud
(90d1ae60-10f7-4596-9a1c-d63cf65ff1e0)
prod01.twl.eqiad1.wikimedia.cloud (d11750b9-3578-47e8-9821-2e3f6ddf6371)
deployment-elastic05.deployment-prep.eqiad1.wikimedia.cloud
(f0a0822d-4a84-493d-a28b-df985bf739ba)
hashtags-prod.hashtags.eqiad1.wikimedia.cloud
(35ef6320-6ceb-4fe6-93f2-e3594ed4e9db)
medbox3-iiab.iiab.eqiad1.wikimedia.cloud
(9e0f875e-5a12-49b9-b2e9-8d2943e4bb97)
language-eg.language.eqiad1.wikimedia.cloud
(e4e71a4b-7121-45ef-844b-775017691e37)
labs-bootstrapvz-jessie.openstack.eqiad1.wikimedia.cloud
(275ccc6d-3730-42c8-8a05-5293ef0db44a)
tools-elastic-3.tools.eqiad1.wikimedia.cloud
(3f7164ce-0eb5-44af-bf80-05c6bba29ec0)
tools-sgegrid-master.tools.eqiad1.wikimedia.cloud
(123949a5-8b58-40e9-97db-a52709a80d5c)
toolsbeta-sgebastion-04.toolsbeta.eqiad1.wikimedia.cloud
(58cf8c32-5af5-4313-8bcd-1d48124faf09)
utrs-database2.utrs.eqiad1.wikimedia.cloud
(89ef3df5-2a09-499e-a73d-682f4454c449)
wikilabels-backups-01.wikilabels.eqiad1.wikimedia.cloud
(4a85bfa1-f920-489c-badb-e8cc9f8d5692)
Thursday, 2020-09-17, 14:00-17:00 UTC: cloudvirt1008 and cloudvirt1009
tracker1.lta-tracker.eqiad1.wikimedia.cloud
(c5ed2a09-d95b-4658-80da-babc88f17053)
pontoon-logstash7-02.monitoring.eqiad1.wikimedia.cloud
(de390caf-2515-4547-b61d-6b911869671e)
jeh-puppet.testlabs.eqiad1.wikimedia.cloud
(5289f03b-1e69-4c4d-810e-1e813be40ed0)
tools-mail-02.tools.eqiad1.wikimedia.cloud
(777ed08b-16db-4911-9c0d-0c8052c3f99f)
tools-clushmaster-02.tools.eqiad1.wikimedia.cloud
(adcfb94d-be6f-4d90-8d16-8fa6bdbb2419)
toolsbeta-puppetmaster-03.toolsbeta.eqiad1.wikimedia.cloud
(ffa2ce19-3340-48e5-889a-f6c580e2b5b4)
toolsbeta-legacy-redirector.toolsbeta.eqiad1.wikimedia.cloud
(a9111792-2e9d-4910-a15e-5fdac7a5c54e)
toolsbeta-paws-worker-1002.toolsbeta.eqiad1.wikimedia.cloud
(496f5c76-ee3d-49d1-a8db-20bf33b30153)
wikidata-realtime-dumps.wikidata-realtime-dumps.eqiad1.wikimedia.cloud
(06179518-b478-4fdc-95a5-4b4e18b55208)
cloudstore-dev-02.cloudstore.eqiad1.wikimedia.cloud
(2f897f38-e151-4873-9472-083545bcf351)
canary1009-01.cloudvirt-canary.eqiad1.wikimedia.cloud
(1b675b77-444b-4e25-81a8-3eb0587a7bfe)
deployment-aqs03.deployment-prep.eqiad1.wikimedia.cloud
(950f0b0a-e797-426d-80e4-a9c1b6fb9aca)
puppet-lta.lta-tracker.eqiad1.wikimedia.cloud
(8b764eb9-2dca-4902-a9c5-ed54fa3fc57d)
osmit-test.osmit.eqiad1.wikimedia.cloud
(eafdf7bf-7b08-48ec-b6d8-828e391799f1)
toolsbeta-paws-master-01.toolsbeta.eqiad1.wikimedia.cloud
(268ff37d-f5eb-4cbc-9c28-6782f3a94f50)
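If you maintain several VMs, you may prefer to search the schedule above with a script rather than by eye. Here is a rough sketch that assumes the list is pasted as plain text in the layout used in this email (a day header, then one hostname per entry with its ID either on the same line or on the next one). The function name and the regular expressions are illustrative only.

```python
import re

# "Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001" -> capture the date.
HEADER_RE = re.compile(r"^\w+, (\d{4}-\d{2}-\d{2}),")
# A VM entry starts with its fully qualified hostname.
HOST_RE = re.compile(r"^([\w.-]+\.eqiad1\.wikimedia\.cloud)\b")


def schedule_by_host(text):
    """Map each VM hostname to the date of its migration window."""
    result = {}
    current_date = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            current_date = header.group(1)
            continue
        host = HOST_RE.match(line)
        if host and current_date:
            result[host.group(1)] = current_date
        # Lines holding only a "(uuid)" match neither pattern and are skipped.
    return result


# A short excerpt of the schedule, copied from this email.
SAMPLE = """\
Monday, 2020-09-14, 14:00-17:00 UTC: cloudvirt1001
dumps-4.dumps.eqiad1.wikimedia.cloud (8b8e9f64-4491-4cb0-85b1-f41e06772a2c)
Tuesday, 2020-09-15, 14:00-17:00 UTC: cloudvirt1002 and cloudvirt1003
soweego.soweego.eqiad1.wikimedia.cloud
(810aa430-cb07-45c2-b278-0f33a0243b5b)
"""

print(schedule_by_host(SAMPLE))
```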
[0]
https://techblog.wikimedia.org/2020/08/24/ceph-distributed-vm-storage-comin…
Several people have asked on IRC and Phabricator if the deprecation of
*.wmflabs names for Cloud VPS instances means that the service names
used to connect to ToolsDB and the Wiki Replicas are changing. The
answer is no, these names are not changing yet. We will be replacing
these service names eventually, but for now they are staying in the
wmflabs pseudo domain.
In 2017 [0] we established new canonical service names for accessing
the shared database servers for the Cloud Services environment. Those
names are still the same today.
The naming convention for connecting to the Wiki Replica servers is:
"<wikidb>.{analytics,web}.db.svc.eqiad.wmflabs". The
*.web.db.svc.eqiad.wmflabs service names are intended for queries that
need a real-time response. The *.analytics.db.svc.eqiad.wmflabs
service names are intended for batch jobs and long running queries.
See the announcement from 2017 for more details [0].
The preferred service name for ToolsDB is tools.db.svc.eqiad.wmflabs.
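For tools that build these names in code, the convention can be sketched as follows. The function and constant names are made up for this example, and "enwiki" is only a sample database name.

```python
# The two service families named in the convention above.
WIKI_REPLICA_SERVICES = ("analytics", "web")

# Preferred ToolsDB service name, as stated above.
TOOLSDB_HOST = "tools.db.svc.eqiad.wmflabs"


def wiki_replica_host(wikidb, service="web"):
    """Build "<wikidb>.{analytics,web}.db.svc.eqiad.wmflabs".

    Use "web" for queries that need a real-time response and "analytics"
    for batch jobs and long-running queries.
    """
    if service not in WIKI_REPLICA_SERVICES:
        raise ValueError(f"service must be one of {WIKI_REPLICA_SERVICES}")
    return f"{wikidb}.{service}.db.svc.eqiad.wmflabs"


print(wiki_replica_host("enwiki", "analytics"))
print(TOOLSDB_HOST)
```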
[0]: https://phabricator.wikimedia.org/phame/post/view/70/new_wiki_replica_serve…
Bryan, on behalf of the Cloud Services team
--
Bryan Davis Technical Engagement Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808
Hi everyone!
The 2019 Cloud Services Annual Survey collected feedback from Toolforge
project members and Cloud VPS project administrators on how the services
offered can be improved to better support their development and
maintenance needs. It ran from 2019-11-25 to 2019-12-13 and had 108
participants.
Due to unforeseen circumstances this year, the analysis of the data and
the publication of the results were delayed. We have finally been able to
publish the results, which you can read on wiki:
https://meta.wikimedia.org/wiki/Research:Cloud_Services_Annual_Survey/2019
Thanks to everyone who participated and provided input and comments. Your
feedback is very useful and instrumental in shaping future improvements
to the cloud services.
We will launch the 2020 Cloud Services survey in a couple of months
following a similar methodology.
Have a nice day!
tl;dr:
VMs created on or after September 8th will no longer get .eqiad.wmflabs
domains and will be found only under .eqiad1.wikimedia.cloud.
The whole story:
Currently Cloud VPS VMs stand astride two worlds: wmflabs and
wikimedia.cloud. Here's the status quo:
- New VMs get three different DNS entries:
hostname.project.eqiad1.wikimedia.cloud, hostname.project.eqiad.wmflabs,
and hostname.eqiad.wmflabs [0]
- Reverse DNS lookups return hostnames under eqiad1.wikimedia.cloud
- VMs themselves believe (e.g. via hostname -f) that they're still under
eqiad.wmflabs
That hybrid system has done a good job maintaining backwards
compatibility, but it's a bit of a mess. In the interest of simplifying,
standardizing, and eliminating ever more uses of the term 'labs', we're
going to start phasing out the wmflabs domain name. Beginning on
September 8th, new VMs will no longer receive any naming associated with
.wmflabs [1].
- New VMs will get one DNS entry: hostname.project.eqiad1.wikimedia.cloud
- New VMs will continue to have a pointer DNS entry that refers to the
.wikimedia.cloud name
- New VMs will be assigned an internal hostname under .wikimedia.cloud
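The difference between the two schemes can be sketched in a few lines. This is an illustration only: the function name and the `legacy` flag are made up for this example, and the hostname and project used are sample values.

```python
def dns_names(hostname, project, legacy=False):
    """List the forward DNS names a VM is expected to have.

    legacy=True  -> the three-name status quo described above
    legacy=False -> a VM created on or after September 8th
    """
    names = [f"{hostname}.{project}.eqiad1.wikimedia.cloud"]
    if legacy:
        names.append(f"{hostname}.{project}.eqiad.wmflabs")
        names.append(f"{hostname}.eqiad.wmflabs")
    return names


print(dns_names("myvm", "myproject"))
print(dns_names("myvm", "myproject", legacy=True))
```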
In order to avoid breaking existing systems, these changes will NOT be
applied retroactively to existing VMs. Old DNS entries will live on
until the VM is deleted and should be largely harmless. If, however,
you find yourself rewriting code in order to deal with VMs under both
domains (due to the change in hostname -f behavior), don't worry --
adjusting an old VM to identify as part of .wikimedia.cloud only
requires a simple change to /etc/hosts. I'll be available to make that
change for any project that chooses consistency over
backwards-compatibility.
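If you would rather keep a mixed fleet and handle both domains in code, one possible approach is to normalize whatever `hostname -f` reports into a (hostname, project) pair. The sketch below assumes VM hostnames and project names contain no dots; the function name is hypothetical.

```python
NEW_SUFFIX = ".eqiad1.wikimedia.cloud"
OLD_SUFFIX = ".eqiad.wmflabs"


def split_vm_fqdn(fqdn):
    """Return (hostname, project) for a VM name under either domain.

    The project is None for the short legacy form hostname.eqiad.wmflabs.
    """
    for suffix in (NEW_SUFFIX, OLD_SUFFIX):
        if not fqdn.endswith(suffix):
            continue
        parts = fqdn[: -len(suffix)].split(".")
        if not all(parts):
            break  # empty label, e.g. a bare suffix
        if len(parts) == 2:
            return parts[0], parts[1]
        if len(parts) == 1 and suffix == OLD_SUFFIX:
            return parts[0], None
        break
    raise ValueError(f"not a recognized VM name: {fqdn!r}")


print(split_vm_fqdn("myvm.myproject.eqiad.wmflabs"))
print(split_vm_fqdn("myvm.myproject.eqiad1.wikimedia.cloud"))
```

Comparing the resulting pairs rather than the raw strings lets old and new VMs be treated alike.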
[0]
https://phabricator.wikimedia.org/phame/post/view/191/new_names_for_everyone
[1] https://phabricator.wikimedia.org/T260614
https://toolsadmin.wikimedia.org has been updated to a new version of
Striker [0]. This is a feature release that carries some quality-of-life
improvements for tool maintainers and updates for recent changes
to Toolforge webservice URLs [1]:
* Allow self-service creation of Phabricator projects for Tools
* Allow tool maintainers to delete toolinfo records
* Improved explanation of toolinfo fields
* Use *.toolforge.org URLs when generating toolinfo data
The killer feature in this list is self-service Phabricator project
creation! This action is available from the details page for any tool
right under the prior option for creating Diffusion repositories.
Many, many thanks to Taavi Väänänen (User:Majavah) for writing the
code for Phabricator project creation. Even more thanks for their
patience in waiting for code review and deployment.
[0]: https://wikitech.wikimedia.org/wiki/Striker
[1]: https://wikitech.wikimedia.org/wiki/Toolsadmin.wikimedia.org/Deployments#20…
Bryan, on behalf of the Toolforge admin team