On Monday, February 24th at 14:00 UTC, I am going to upgrade MariaDB to the
latest patch version 10.6.20. [0]
This will cause a brief downtime for ToolsDB (just a couple of minutes).
[0] https://phabricator.wikimedia.org/T385885
--
Francesco Negri (he/him) -- IRC: dhinus
Site Reliability Engineer, Cloud Services team
Wikimedia Foundation
We will be running a brief network failover test tomorrow which will
cause some network connections to reset.
There shouldn't be any long-lasting effects or any intervention needed on
the part of users, so please ignore the flap.
-Andrew
tl;dr: A minor change to DNS resolution[0] for toolforge and cloud-vps
services is coming on Monday. It should have no effect, but please yell if
you see things break.
The whole story:
Back in 2020[1] we stopped associating new VMs with the .wmflabs
top-level domain; since then all VMs have been accessed via
.wikimedia.cloud instead. As of a few weeks ago, the last remaining
.wmflabs VM was deleted, so we're now cleaning up code and config that
supported that domain.
Right now if you try to resolve a stand-alone hostname (e.g. by typing
"ping mycoolserver") the resolver will try three different fqdns: first
'mycoolserver.<projectname>.eqiad1.wikimedia.cloud.' and, failing that,
'mycoolserver.<projectname>.eqiad.wmflabs.' and, failing that, the
simple name 'mycoolserver.'.
After Monday, that second fallback won't happen, so it'll either be
'mycoolserver.<projectname>.eqiad1.wikimedia.cloud.' or 'mycoolserver.'.
Most people don't use simple hostnames anyway; for those users this
change will have no effect. Some toolforge applications still use the
old-fashioned 'tools-redis' or 'tools-db' hostnames; we already have
changes in place to resolve those correctly after the change. If there
are single hostname edge cases that I don't know about and can't find,
their behavior may change or break in surprising ways.
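The lookup order described above can be sketched in a few lines of Python.
This is only an illustration of the before/after candidate list, not the
actual resolver configuration: the function name `candidate_fqdns`, the
`include_wmflabs` flag, and the project name `myproject` are all made up
for the example.

```python
def candidate_fqdns(hostname, project, include_wmflabs):
    """Return the names tried, in order, for a stand-alone hostname.

    Hypothetical helper for illustration only; the real behavior comes
    from the resolver's search configuration, not from code like this.
    """
    # First attempt: the project-qualified .wikimedia.cloud name.
    candidates = [f"{hostname}.{project}.eqiad1.wikimedia.cloud."]
    if include_wmflabs:
        # Legacy fallback that goes away with this change.
        candidates.append(f"{hostname}.{project}.eqiad.wmflabs.")
    # Last attempt: the simple name on its own.
    candidates.append(f"{hostname}.")
    return candidates


before = candidate_fqdns("mycoolserver", "myproject", include_wmflabs=True)
after = candidate_fqdns("mycoolserver", "myproject", include_wmflabs=False)

for label, names in (("before", before), ("after", after)):
    print(label)
    for name in names:
        print("  " + name)
```

The only difference between the two lists is the middle .wmflabs entry, so a
stand-alone name is affected only if it currently resolves through that
fallback.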
Note that fully qualified service names (for example
gurwiki.analytics.db.svc.eqiad.wmflabs) are unaffected by this update. I
would love to eliminate them too, but it's unclear how to identify and
remove all uses, so that cleanup will be left for another day.
[0] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1118151
[1] https://wikitech.wikimedia.org/wiki/News/2020_Phasing_out_the_.wmflabs_doma…
As announced on 2025-02-04 [0] there is a new timeline for the
Wikitech SUL finalization. Today I completed the planned step of
renaming and attaching all legacy local accounts that had been linked
to a Wikimedia SUL account via idm.wikimedia.org or
toolsadmin.wikimedia.org.
Anyone who has missed, or ignored, the mailing list notifications and
site notice on wikitech.wikimedia.org now has until 17:00 UTC on
2025-02-24 to use https://idm.wikimedia.org/login/mediawiki/ to claim
the association of their Developer account and SUL account.
At that time I will be making the final run of the scripts to rename
claimed accounts. Soon after, Amir or I will begin the work to rename
all unclaimed accounts with a `~labswiki` suffix and attach them to
SUL to complete the SUL migration.
[0]: https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/…
Bryan, on behalf of the Wikitech admins and everyone helping with the migration
--
Bryan Davis Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808
We will be upgrading the Toolforge Harbor service on Monday.
Any attempt to build a new image or use an existing image in jobs will
likely fail during the upgrade.
Long-running jobs and webservices should not be affected. New
jobs/webservices using images stored in Harbor may or may not work,
depending on whether the images are cached.
The outage will likely last only 10 to 30 minutes, and you should be able
to build images again after that.
TL;DR:
* 2025-02-10: Wikitech config will be changed to only use Wikimedia
SUL accounts.
* 2025-02-24: Final migration.
As announced back in September 2024, Wikitech is becoming a more
normal Wikimedia wiki [0].
On October 1, 2024 we made the necessary configuration changes to
detach Wikitech from the LDAP directory that stores Developer account
[1] information. Since then a number of users have been able to
recover access to their Wikitech accounts and manually connect them to
their SUL accounts. There are, however, a large number of accounts that
still need to be converted to SUL.
We had previously planned to make changes at the end of November 2024
and January 2025 to complete the SUL account migration for everyone.
As you might guess from the subject of this email, we did not complete
that plan as hoped.
An updated timeline has been created after an examination of the
remaining work. The new timeline is:
* 2025-02-10: Wikitech config changed to only use Wikimedia SUL
accounts. All legacy local accounts that have been linked to a
Wikimedia SUL account via idm.wikimedia.org or
toolsadmin.wikimedia.org will be renamed if necessary and then
attached to the SUL account.
* 2025-02-24: Final migration. SUL account mappings added since the
prior migration will be processed. All remaining unattached accounts
will be converted to SUL accounts. This final conversion will include
renaming any local accounts that match existing SUL accounts to avoid
name collisions.
This timeline and more information can be found on Wikitech at
<https://wikitech.wikimedia.org/wiki/News/2024_Migrating_Wikitech_Account_to…>.
[0]: https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/…
[1]: https://www.mediawiki.org/wiki/Developer_account
Bryan, on behalf of the Wikitech admins and everyone helping with the migration
--
Bryan Davis Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808
Next Tuesday we'll be replacing[0] the cloud gateway nodes that serve
the entire cloud-vps and toolforge network. If all goes well, this will
only cause a quick connection reset, but we're scheduling a window of
potential downtime in case things do not go well.
This work is in response to hardware errors that have been appearing on
one of the existing gateway hosts[1].
-Andrew + the WMCS team
[0] https://phabricator.wikimedia.org/T382356
[1] https://phabricator.wikimedia.org/T382220
Heads up that the upcoming VM reboots [0] planned for Thursday, Feb 6
will cause a brief downtime for ToolsDB. We expect the downtime to
last about 10 minutes, starting at 14:00 UTC on Thursday.
[0] https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.…
--
Francesco Negri (he/him) -- IRC: dhinus
Site Reliability Engineer, Cloud Services team
Wikimedia Foundation