Next Monday, November 25th, at around 13:00 UTC, ToolsDB will be
upgraded from MariaDB v10.4.29 to MariaDB v10.6.19. [0]
I have already created a new host "tools-db-4" that is running the new
version, and is replicating from the current primary. On Monday, I
will fail over the current primary to the new host. [1]
All connections will be dropped and the DNS will be updated to point
to the new host. Tools should automatically reconnect to the new host.
No downtime is expected but there will be a few minutes of read-only
time.
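For tools that hold long-lived connections, a small retry wrapper can help ride out the failover. The following is a minimal sketch, not a ToolsDB-specific recipe: `run_with_reconnect`, `connect`, and `operation` are hypothetical names, and the exception classes in `retry_on` are placeholders that you would replace with the error classes raised by your own database driver.

```python
import time


def run_with_reconnect(connect, operation, retries=5, base_delay=1.0,
                       retry_on=(ConnectionError, OSError), sleep=time.sleep):
    """Open a fresh connection and run operation(conn), retrying on failure.

    connect:   hypothetical zero-argument callable returning a new connection.
    operation: hypothetical callable that takes the connection and does the work.
    retry_on:  exception classes treated as "connection dropped" (assumption:
               replace these with your driver's own error classes).
    """
    for attempt in range(retries):
        try:
            conn = connect()
            try:
                return operation(conn)
            finally:
                # Close the connection if the object supports it.
                close = getattr(conn, "close", None)
                if close is not None:
                    close()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

Writes attempted during the read-only window will still fail until the new primary accepts them, so the retry budget should cover a few minutes if your tool cannot simply skip a cycle.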
For most tools, the upgrade should be painless. However, you might
want to check the official docs listing incompatible changes
introduced in version 10.5 [2] and 10.6. [3]
If you find any issues, please let us know in the #wikimedia-cloud IRC channel.
[0] https://phabricator.wikimedia.org/T352206
[1] https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/ToolsDB#Failing_…
[2] https://mariadb.com/kb/en/upgrading-from-mariadb-10-4-to-mariadb-10-5/
[3] https://mariadb.com/kb/en/upgrading-from-mariadb-10-5-to-mariadb-10-6/
--
Francesco Negri (he/him) -- IRC: dhinus
Site Reliability Engineer, Cloud Services team
Wikimedia Foundation
On Friday I will delete the following VMs. All are running the
long-deprecated Debian Buster OS and have been shut down for several
months without user response or complaint.
Please respond directly to me if you need any of the VMs listed below
to be preserved in some form. Context can be found at
https://phabricator.wikimedia.org/T331738.
centralnotice-staging:
  cn-staging-3.centralnotice-staging.eqiad1.wikimedia.cloud
commons-corruption-checker:
  main.commons-corruption-checker.eqiad1.wikimedia.cloud
deployment-prep:
  deployment-docker-proton01.deployment-prep.eqiad1.wikimedia.cloud
  deployment-echostore02.deployment-prep.eqiad1.wikimedia.cloud
  deployment-maps-master01.deployment-prep.eqiad1.wikimedia.cloud
  deployment-poolcounter06.deployment-prep.eqiad1.wikimedia.cloud
  deployment-restbase04.deployment-prep.eqiad1.wikimedia.cloud
etytree:
  etytree-a.etytree.eqiad1.wikimedia.cloud
mediawiki-vagrant:
  mwv-builder-03.mediawiki-vagrant.eqiad1.wikimedia.cloud
schematreerecommender:
  recommender.schematreerecommender.eqiad1.wikimedia.cloud
  stress-tester.schematreerecommender.eqiad1.wikimedia.cloud
wikicommunityhealth:
  backend.wikicommunityhealth.eqiad1.wikimedia.cloud
  frontend.wikicommunityhealth.eqiad1.wikimedia.cloud
wikispore:
  wikispore-prod.wikispore.eqiad1.wikimedia.cloud
If you are using wiki replicas to query the "wikidatawiki" database
(section s8), please be aware that we are expecting replication lag to
grow up to 10 days, because of some ongoing maintenance work [0].
Only section s8 is affected, which contains only the "wikidatawiki"
database. [1] Queries against other databases should not see any lag.
This is going to impact tools using wiki replicas, as well as queries
running on Quarry or PAWS.
You can check the current replication lag at https://replag.toolforge.org/
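If your tool needs to decide at run time whether the replica is fresh enough, something along these lines may help. This is only a sketch under stated assumptions: it assumes the replicas expose a `heartbeat_p.heartbeat` table with `shard` and `lag` columns, that `cursor` is a DB-API cursor returning tuples, and that your driver uses `%s` placeholders. The function names are made up for this example.

```python
def replica_lag_seconds(cursor, shard="s8"):
    """Return the reported lag in seconds for a section, or None if no row.

    ASSUMPTION: a heartbeat_p.heartbeat table with `shard` and `lag` columns
    is available on the replica the cursor is connected to.
    """
    cursor.execute(
        "SELECT lag FROM heartbeat_p.heartbeat WHERE shard = %s", (shard,)
    )
    row = cursor.fetchone()
    return None if row is None else float(row[0])


def is_fresh_enough(lag_seconds, max_lag_seconds):
    """True only when the lag is known and within the tool's tolerance."""
    return lag_seconds is not None and lag_seconds <= max_lag_seconds
```

A tool could call `is_fresh_enough(replica_lag_seconds(cursor), 3600)` and show a "data may be outdated" notice when it returns `False`.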
Thanks for your patience while we complete this maintenance work.
[0] https://phabricator.wikimedia.org/T367856
[1] https://noc.wikimedia.org/db.php#tabs-s8
--
Francesco Negri (he/him) -- IRC: dhinus
Site Reliability Engineer, Cloud Services team
Wikimedia Foundation
If your tool does not read user information from Wiki Replicas, feel
free to ignore this email.
Temporary accounts [0] are starting to be rolled out, and since
yesterday they have been enabled on a few smaller wikis: Czech Wikiversity,
Igbo Wikipedia, Italian Wikiquote, Swahili Wikipedia, and
Serbo-Croatian Wikipedia. [1]
Temporary accounts change the way MediaWiki stores anonymous users in
database tables. If you manage a tool that reads user information for
anonymous users, you should check the page "How should I update my
code?" [2] to find out if you need to make changes to your tool. You
can use the wikis listed above to test that your tool is working
correctly.
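As an illustration of the kind of change a tool might need, here is a rough sketch that sorts user names into three buckets. It rests on an assumption that you should verify against the "How should I update my code?" page: that temporary account names start with "~" followed by a four-digit year (for example "~2024-12345"). `classify_actor` and `TEMP_ACCOUNT_RE` are hypothetical names used only in this example.

```python
import ipaddress
import re

# ASSUMPTION: temporary account names look like "~2024-12345", i.e. a tilde,
# a four-digit year, a hyphen, then digits. Check the linked documentation
# for the authoritative pattern before relying on this.
TEMP_ACCOUNT_RE = re.compile(r"^~\d{4}-\d")


def classify_actor(name):
    """Return "temp", "ip", or "registered" for a user name string."""
    if TEMP_ACCOUNT_RE.match(name):
        return "temp"
    try:
        # Names that parse as IPv4/IPv6 addresses are classic anonymous users.
        ipaddress.ip_address(name)
        return "ip"
    except ValueError:
        return "registered"
```

Code that previously treated "is an IP address" as "is anonymous" would, under this assumption, also need to handle the "temp" case.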
If you have questions or if you want to report an issue, you can file
a Phabricator task as a subtask of [3].
[0] https://www.mediawiki.org/wiki/Trust_and_Safety_Product/Temporary_Accounts
[1] https://www.mediawiki.org/wiki/Trust_and_Safety_Product/Temporary_Accounts/…
[2] https://www.mediawiki.org/wiki/Trust_and_Safety_Product/Temporary_Accounts/…
[3] https://phabricator.wikimedia.org/T378516
--
Francesco Negri (he/him) -- IRC: dhinus
Site Reliability Engineer, Cloud Services team
Wikimedia Foundation
Hi!
We will be upgrading the Toolforge build service tomorrow, October 8th, at 12:00 UTC.
No downtime is expected, but the process is a bit flaky this time, and if
something goes wrong, it might cause new builds to fail, or prevent you from
accessing the logs of old ones until it is manually fixed.
I'll post updates by replying to this email and on IRC before and after the upgrade.
Thanks!
--
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE 1171 4071 C7E1 D262 69C3
"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."
Hello everyone!
We are happy to announce that the Toolforge jobs framework now supports
multiple replicas for continuous jobs!
There are times when you might need to run multiple instances of the
same thing (e.g. multiple processes). This change allows you to do
that using the `--replicas` option.
An example command would be something like:
`toolforge jobs run multi-replica-con-job --command ./command.sh --image bookworm --continuous --replicas 5`
The log output from each of the running instances will be aggregated
and can be streamed (if you are using the --no-filelog option) or
written to the log files.
Note: This can only be configured for continuous jobs.
Note: There is no limit to the number of replicas you can specify, but
running too many replicas can exceed the resource quota assigned to
each job/tool by default. If this happens, the job may fail to run
with an out-of-quota error message.
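To get a feel for how quickly replicas add up, here is a back-of-the-envelope sketch. All numbers and names in it are hypothetical: it simply assumes each replica requests the same CPU and memory and that the totals are compared against a quota you have looked up for your own tool.

```python
def fits_quota(replicas, cpu_per_replica, mem_per_replica_gib,
               cpu_quota, mem_quota_gib):
    """Return True if `replicas` identical instances fit within the quota.

    All values are hypothetical inputs; look up your tool's real quota and
    per-job requests before relying on the result.
    """
    total_cpu = replicas * cpu_per_replica
    total_mem = replicas * mem_per_replica_gib
    return total_cpu <= cpu_quota and total_mem <= mem_quota_gib
```

For example, with a made-up quota of 8 CPUs and 8 GiB, five replicas at 0.5 CPU and 0.5 GiB each would fit, while twenty would not.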
Also, a reminder that you can find this and smaller user-facing updates about
the Toolforge platform features here:
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Changelog
Original task: https://phabricator.wikimedia.org/T341066
--
Ndibe Raymond Olisaemeka
Software Engineer II - Technical Engagement
Wikimedia Foundation <https://wikimediafoundation.org/>
Hello everyone,
We will be starting the Toolforge Kubernetes upgrade to v1.27 in a few
minutes. As we mentioned earlier, we do not expect any downtime, but
please let us know if you notice any unexpected behaviour that needs our
attention.
--
Ndibe Raymond Olisaemeka
Software Engineer II - Technical Engagement
Wikimedia Foundation <https://wikimediafoundation.org/>
On Tue, Sep 10, 2024 at 6:43 PM Raymond Ndibe <rndibe(a)wikimedia.org> wrote:
>
> Hello everyone!
>
> We will be upgrading Toolforge Kubernetes to v1.27 on Monday Sep 16
> around 1:00PM UTC.
>
> We do not expect any downtime, but some jobs and webservices may
> restart as they get shuffled around to different worker nodes. Please
> report any issues you encounter.[0]
>
> For details see: https://phabricator.wikimedia.org/T359641
>
> [0] https://wikitech.wikimedia.org/wiki/Portal:Toolforge#Communication_and_supp…
>
> Cheers,
> --
> Ndibe Raymond Olisaemeka
> Software Engineer II - Technical Engagement
>
> Wikimedia Foundation <https://wikimediafoundation.org/>