On Monday next week we plan to expand the volume that stores PAWS
notebook files. PAWS will be down during the resize, possibly for as
long as an hour.
We are not planning to change the current restrictions on PAWS file
usage for individual notebooks, but this expansion will allow us to
support ongoing growth in usage without needing to purge existing old
and large notebooks.
The resize will begin around 14:00 UTC.
_______________________________________________
Cloud-announce mailing list -- cloud-announce(a)lists.wikimedia.org
List information: https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.…
Hello everyone!
A quick reminder that our *Wikimania Hackathon 2025 Newcomer Orientation
Session* is happening *today, July 12th, at 13:00 UTC*, which is in 2
hours 30 minutes.
Whether you're new to Wikimedia tech spaces or just want to learn how to
get involved in this year’s Hackathon, this session is for you. We’ll walk
through what to expect, how to find and join projects, and the support
available for first-time participants.
🔗 *Join us here:* https://meet.google.com/geq-hgdb-bvp (Check your timezone
<https://link.gmreg5.net/x/d?c=46978147&l=38969c3d-39d5-4340-9286-93e20fe07f…>)
We look forward to seeing you there and helping you get ready for a great
Hackathon experience in Nairobi!
– *The Wikimania 2025 Hackathon Working Group*
(resending from the correct email account)
tl;dr:
An issue with our storage backend caused many VMs to freeze intermittently
over the last 12 hours or so. This affected Toolforge as well as many
Cloud VPS projects.
Everything should be recovered now but if you find a misbehaving VM, a
hard reboot should resolve the issue.
Longer version:
Yesterday I began the process of upgrading many of our Ceph storage
nodes from Debian 11.0 Bullseye to Debian 12.0 Bookworm. A
not-yet-understood interaction between our Ceph version (16.2.15) and
Debian Bookworm produced runaway memory usage on the upgraded servers.
After a few hours they began to swap, and the Ceph services began to
freeze intermittently.
Ceph is resilient to failures like this, but in some cases multiple Ceph
services (which would normally have served as backups for each other)
froze at the same time. During partial storage failures Ceph prioritizes
data integrity over availability, and so began to make some storage
blocks unavailable and/or read-only. That erratic storage behavior in
turn caused VMs to crash or (more often) temporarily lock up.
We are now reverting the Bookworm upgrade, and will bypass the broken
combination of Bookworm and 16.x in future upgrades. Not all OSD nodes
have been rebuilt with Bullseye yet, but we have rebuilt enough that
Ceph should be able to cope with service failures on the servers that
are still pending rebuild.
As far as I can tell, most or all VMs have recovered on their own, and
we don't see any evidence of data corruption. If you find unresponsive
VMs, a 'hard reboot' from the Horizon UI should resolve any remaining
issues. Please reply to this email or contact me on IRC if you find
data corruption or VMs that cannot be revived.
Full details of this incident can be found in this Phabricator ticket:
https://phabricator.wikimedia.org/T399281
Thanks to Francesco, Ben Tullis, and Alexandros Kosiaris for their
speedy assistance in resolving this issue!
- Andrew
If you use Wiki Replicas to query Wikidata term store tables (wbt_*),
be aware that those tables are not currently getting live updates.
The Data Persistence team is working on setting up the new "x3"
section on the Wiki Replicas hosts,[0] which is expected to take multiple
days to complete.
In the meantime, you can start using the following new endpoints:
- termstore.wikidatawiki.analytics.db.svc.wikimedia.cloud
- termstore.wikidatawiki.web.db.svc.wikimedia.cloud
Those endpoints are already returning data, but the data returned will
not be up-to-date. As soon as the replication of those tables is
restarted on Wiki Replicas, those endpoints will start returning live
data.
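
As a rough illustration of how a tool might pick between hosts, here is a
minimal Python sketch. The two termstore hostnames are the ones listed
above; the default hostname pattern, the function names, and the idea of
routing by table name are assumptions made for this example and are not
taken from the Wikitech page.

```python
import re

# Hostnames announced in this email for the term store tables.
TERMSTORE_HOSTS = {
    "analytics": "termstore.wikidatawiki.analytics.db.svc.wikimedia.cloud",
    "web": "termstore.wikidatawiki.web.db.svc.wikimedia.cloud",
}

# Assumption: the regular wikidatawiki replica hostnames follow this pattern.
DEFAULT_HOST_TEMPLATE = "wikidatawiki.{cluster}.db.svc.wikimedia.cloud"

# Matches identifiers that start with "wbt_" (for example wbt_item_terms).
_WBT_TABLE = re.compile(r"\bwbt_\w+", re.IGNORECASE)


def uses_termstore(sql: str) -> bool:
    """Return True if the query text mentions a wbt_* table."""
    return _WBT_TABLE.search(sql) is not None


def pick_host(sql: str, cluster: str = "analytics") -> str:
    """Pick a hostname for the query (hypothetical helper).

    Queries that mention wbt_* tables go to the termstore endpoint;
    everything else goes to the assumed default wikidatawiki host.
    """
    if cluster not in TERMSTORE_HOSTS:
        raise ValueError(f"unknown cluster: {cluster!r}")
    if uses_termstore(sql):
        return TERMSTORE_HOSTS[cluster]
    return DEFAULT_HOST_TEMPLATE.format(cluster=cluster)
```

This only chooses a hostname; it does not open a connection, and a query
that joins wbt_* tables with other tables is outside what this sketch
handles.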
For more information, refer to the News page on Wikitech.[1]
[0] https://phabricator.wikimedia.org/T390954
[1] https://wikitech.wikimedia.org/wiki/News/2025_Wikidata_term_store_database_…
--
Francesco Negri (he/him) -- IRC: dhinus
Site Reliability Engineer, Cloud Services team
Wikimedia Foundation
Hello everyone,
We’re thrilled to announce that the Wikimania Hackathon 2025
<https://link.gmreg5.net/x/d?c=46891699&l=fa899007-39b7-4091-b951-bf352f2381…>
will take place onsite in Nairobi, Kenya!
The Hackathon is a space where technical contributors from across the
movement come together to collaborate and innovate. Whether you're a
developer, designer, QA tester, translator, tech writer, or just curious
about how Wikimedia technology works, this is your chance to connect,
build, and learn.
We’ll be running a Newcomer Track to support first-time participants,
with sessions and help finding projects to join. If you’re attending
as a newcomer, you won’t be alone — we’ve got resources and support to help
you get started. More details will be shared during the Newcomer
Orientation Session, happening on:
- July 12th at 13:00 UTC
  <https://link.gmreg5.net/x/d?c=46891699&l=f5f2cc01-122c-4b49-8887-46cfb5c840…>
  (https://zonestamp.toolforge.org/1752325200)
- July 19th at 16:00 UTC
  <https://link.gmreg5.net/x/d?c=46891699&l=0f622452-f77b-4cbe-96cd-cf506c5372…>
  (https://zonestamp.toolforge.org/1752940800)
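
The numbers at the end of the zonestamp links above are Unix timestamps
(seconds since 1970-01-01 UTC). If you would rather check the session
times yourself, the following small Python sketch converts them. The
function names are made up for this example, and the fixed UTC+3 offset
is used here as East Africa Time for Nairobi.

```python
from datetime import datetime, timedelta, timezone

# Unix timestamps copied from the zonestamp links above.
SESSIONS = {
    "July 12th": 1752325200,
    "July 19th": 1752940800,
}

# East Africa Time (Nairobi) is UTC+3 with no daylight saving time.
EAT = timezone(timedelta(hours=3), name="EAT")


def to_utc(epoch_seconds: int) -> datetime:
    """Convert a Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def to_zone(epoch_seconds: int, tz: timezone) -> datetime:
    """Convert a Unix timestamp to a datetime in the given timezone."""
    return to_utc(epoch_seconds).astimezone(tz)


if __name__ == "__main__":
    for label, epoch in SESSIONS.items():
        utc = to_utc(epoch)
        local = to_zone(epoch, EAT)
        print(f"{label}: {utc:%Y-%m-%d %H:%M} UTC / {local:%H:%M} {local.tzname()}")
```

To see the time in your own timezone, pass a different `timezone` object
to `to_zone`.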
🎯 Call for Projects Now Open!
We’re now accepting project proposals! Have an idea, task, or feature you'd
like to work on or lead? Head over to our Diff post to submit your proposal
and to learn more about the event and project submission, including how to
prepare and participate:
https://diff.wikimedia.org/2025/06/30/call-for-projects-wikimania-hackathon…
📌 Important Information:
- If you’ve registered to attend Wikimania and plan to join the
  Hackathon, please add yourself to the Participants List:
  https://wikimania.wikimedia.org/wiki/2025:Hackathon/Participants_List
  This helps us plan and connect contributors ahead of time.
- Check out the Hackathon Resources Page
  (https://wikimania.wikimedia.org/wiki/2025:Hackathon/Resources) for
  useful links, project inspiration, onboarding guides, and more.
We’re excited to build, connect, and hack with you in Nairobi. All are
welcome — from seasoned developers to first-timers!
Best regards,
The Wikimania 2025 Hackathon Working Group
Hi everyone!
We are happy to announce that from today, you can start testing the new push-to-deploy features on Toolforge.
This has been implemented as the `components` subcommand of the `toolforge` CLI on the bastions. It lets you create a tool configuration for the deployment, create the deployments themselves, and create a deploy token to be used from external CI systems (e.g. GitLab).
You can find more details on the features and how to start using them on the Wikitech page [1].
Note, though, that this is still under active development, so we will keep fixing any bugs found and adding more features during the beta itself. This means that it might be somewhat unstable and break from time to time. We encourage you to try it out, explore it, and give feedback, but right now we don't recommend relying on it.
You can keep an eye on the Toolforge changelog[2] and the wiki page[1] for the latest features and updates on it.
The task [3] also has an approximate timeline for when we will consider it stable; we are aiming for a stable system by the end of the calendar year.
Thanks a lot!
[1] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Deploy_your_tool
[2] https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Changelog
[3] https://phabricator.wikimedia.org/T393564
--
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE 1171 4071 C7E1 D262 69C3
"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."