The NFS servers used for the scratch and maps mounts (/data/project and /home in the maps project, and /data/scratch in other projects) will be going offline for a short time tomorrow, 2021-07-01, at around 1600 UTC to move the mounts to DRBD-synced volumes. The current setup causes odd issues during failover, including data loss and stale files left behind. This move is itself one of those failovers, so you may see similar anomalies afterward, such as files that were previously deleted showing up again and needing to be deleted once more.
I plan to reboot the maps project servers to make sure their mounts and processes are restored as cleanly as possible. The scratch mounts should see less impact: if you use scratch, just be aware that it will go offline for a bit and will come back with some possible quirks. After that, the data should be far more stable and properly synced between the two systems. The process could start later than 1600 UTC if there are initial sync issues, as I try to get as much of the data as possible transferred first.
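If your tool needs to wait out the outage rather than fail, a small poll loop can help. This is a minimal sketch only (the 30-second interval and one-hour timeout are illustrative, not part of the maintenance plan):

    import errno
    import os
    import time

    SCRATCH = "/data/scratch"  # or /data/project and /home in the maps project

    def wait_for_mount(path, interval=30, timeout=3600):
        """Poll an NFS path until the server answers again after failover."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                os.statvfs(path)  # cheap call that has to reach the NFS server
                return True
            except OSError as err:
                # ESTALE/EIO are the typical failure modes during a failover
                if err.errno not in (errno.ESTALE, errno.EIO):
                    raise
                time.sleep(interval)
        return False

    if wait_for_mount(SCRATCH):
        print("scratch is responding again")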
More details here: https://phabricator.wikimedia.org/T224747
Brooke Storm
Staff SRE
Wikimedia Cloud Services
bstorm@wikimedia.org
Next Tuesday we will be upgrading Kubernetes on Toolforge. As part of
the upgrade we will need to restart all pods. This will produce a brief
interruption in web services and other tools that use Kubernetes.
Assuming your services are able to survive a restart, no action should
be needed on your part. I'll send a further email when the upgrade is
finished.
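For tools that manage their own state, "surviving a restart" mostly means exiting cleanly when Kubernetes sends SIGTERM before replacing the pod. A minimal sketch of that pattern (the work loop is a stand-in for whatever your tool actually does):

    import signal
    import sys
    import time

    shutting_down = False

    def handle_sigterm(signum, frame):
        # Kubernetes sends SIGTERM, then waits a grace period before SIGKILL.
        global shutting_down
        shutting_down = True

    signal.signal(signal.SIGTERM, handle_sigterm)

    while not shutting_down:
        # do one small unit of work, short enough to finish in the grace period
        time.sleep(1)

    # flush buffers, close connections, then exit cleanly
    sys.exit(0)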
Special thanks to volunteer Taavi (aka Majavah) who has been essential
in preparing for this upgrade and will be taking time out of his day to
make sure the upgrade goes smoothly on Tuesday.
-Andrew + the WMCS team
Software that uses /data/scratch may see some disruption tomorrow when I merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/695447/ around 20:00 UTC.
To make the NFS service behind it more resilient and healthy, the cluster it runs on is migrating to DRBD failover, as the Tools home and project volumes already use. The current setup is quite broken.
If you are running something against scratch when the patch becomes active in your project, you may see issues such as a stale NFS handle. If that doesn't resolve quickly, please let us know in #wikimedia-cloud on Libera.Chat.
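A stale handle generally clears once the file is re-opened against the new primary server. If your code holds files open across the window, a retry wrapper along these lines may help (purely a sketch; the retry count and delay are illustrative):

    import errno
    import time

    def read_with_retry(path, attempts=5, delay=10):
        """Re-open the file on ESTALE; a fresh open gets a fresh NFS handle."""
        for attempt in range(attempts):
            try:
                with open(path, "rb") as f:
                    return f.read()
            except OSError as err:
                if err.errno != errno.ESTALE or attempt == attempts - 1:
                    raise
                time.sleep(delay)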
Brooke Storm
Staff SRE
Wikimedia Cloud Services
bstorm@wikimedia.org
TL;DR:
* The #wikimedia-cloud IRC channel is moving from Freenode to Libera.Chat.
* Register an account on Libera.Chat and join us there!
There has been a lot of activity over the last 2-3 days related to
staffing changes on the Freenode IRC network [0]. The Wikimedia IRC
Group Contacts (GCs) [1] evaluated the situation and decided that
moving the Wikimedia IRC channels from Freenode to the brand new
Libera.Chat IRC network [2] would be the best course of action [3].
So, we are moving!
A new #wikimedia-cloud channel has been created on irc.libera.chat for
this Wikimedia sub-community to use. The old channel on Freenode still
exists and will be maintained at least until we can get all the bots
moved, our documentation updated on wikitech, and we see more folks on
the Libera channel than on the Freenode one. Messages sent to our
channel on either IRC network, as well as to the Telegram channel [4],
will be relayed to all the others.
There is a new subpage on meta [5] for information on how to create a
new account for yourself on Libera.Chat and other related information.
There is also a tracking task [6] listing the various activities that
the community hopes to complete as part of the migration.
One last thing: The #wmhack Freenode channel is bridged to
#wikimedia-hackathon on Libera.Chat. The new channel name will make it
easier for the GCs to help manage spam and other issues that come up
occasionally on IRC. Don't miss the fun of our 2021 virtual hackathon
from Friday, May 21st to Sunday, May 23rd! [7]
[0]: https://www.kline.sh/
[1]: https://meta.wikimedia.org/wiki/IRC/Group_Contacts
[2]: https://libera.chat/
[3]: https://meta.wikimedia.org/w/index.php?diff=21476411
[4]: https://t.me/wmcloudirc
[5]: https://meta.wikimedia.org/wiki/IRC/Migrating_to_Libera_Chat
[6]: https://phabricator.wikimedia.org/T283247
[7]: https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2021
Bryan, on behalf of the WMCS team and the Cloud VPS and Toolforge admins
--
Bryan Davis Technical Engagement Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808
Hello there,
We will be doing an upgrade to the Cloud VPS edge network on Thursday 2021-05-06 @
15:00 UTC that will likely impact user experience, including in Toolforge.
We have scheduled a 1-hour operations window. During that time, intermittent
network interruptions, packet loss, and other network problems are to be expected.
The edge network maintenance will affect how virtual machines (and Toolforge
tools) contact NFS, the wiki replicas, the wikis' API endpoints, and, in
general, any network traffic leaving or entering the cloud (also known as
north-south traffic).
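If your tool makes outbound calls during that window (to the wiki APIs, for example), a short retry with backoff should be enough to ride out the intermittent drops. A rough sketch, with illustrative values:

    import time
    import urllib.request

    def fetch_with_backoff(url, attempts=5):
        """Retry transient network failures with exponential backoff."""
        for attempt in range(attempts):
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    return resp.read()
            except OSError:  # URLError and socket timeouts are subclasses
                if attempt == attempts - 1:
                    raise
                time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...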
More information on the operation can be found in phabricator [0] and in
wikitech [1].
Regards.
[0]: https://phabricator.wikimedia.org/T270704
[1]: https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Enhanceme…
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation
We will be upgrading the Cloud-VPS OpenStack install later today
beginning at 14:00 UTC (7:00 AM Pacific time).
The total upgrade should take 60-90 minutes. During the upgrade period
Horizon will be disabled. There may also be brief network interruptions
as we restart router services.
-Andrew + the WMCS Team
TL;DR:
* We messed up when replacing the mail server in Toolforge
* We didn't notice that we had messed up for nearly 3 weeks
* Toolforge servers should be able to send outbound email again now
We have been working to replace some of the Cloud VPS instances in the
Toolforge project with new instances running Debian Buster
(<https://phabricator.wikimedia.org/T275864>). One step in this
process was to replace the mail server instance that handles all
outbound mail.
We set up a new mail server on 2021-03-31, but missed an important
configuration step: telling the rest of the instances in the
Toolforge project to use the new server when sending outgoing mail. A
Toolforge user reported on IRC at 2021-04-20T21:11Z that they had not
received expected emails from their tool recently. Investigation
revealed the broken configuration and work started to correct the
problem. Around 2021-04-20T21:52Z we deployed the correct mail relay
host configuration. Over the next 30 minutes or so this configuration
update rolled out across the Toolforge instances, re-enabling outbound
mail sending. Around 2021-04-20T22:20Z we ran commands to instruct all
Toolforge instances to "unfreeze" emails which were queued for sending
but marked as "frozen" due to the prior invalid configuration.
Emails are now being sent out as expected. We apologize for the
interruption in service. We will also be looking into an active
monitoring system for outbound email delivery, to catch similar
problems more quickly in the future.
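As a starting point for that monitoring, even a simple end-to-end probe that periodically sends a test message and alerts if it never arrives would catch this class of failure. The sending half might look like this (the relay host and addresses below are hypothetical, not the actual Toolforge configuration):

    import smtplib
    from email.message import EmailMessage

    RELAY = "mail.example.wmcloud.org"  # hypothetical relay hostname

    def send_probe():
        msg = EmailMessage()
        msg["From"] = "probe@example.org"
        msg["To"] = "canary@example.org"
        msg["Subject"] = "outbound mail probe"
        msg.set_content("If this arrives, outbound mail is flowing.")
        # raises smtplib.SMTPException or OSError on failure; wire into alerting
        with smtplib.SMTP(RELAY, 25, timeout=30) as smtp:
            smtp.send_message(msg)

    if __name__ == "__main__":
        send_probe()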
Bryan, on behalf of the Toolforge admin team
--
Bryan Davis Technical Engagement Wikimedia Foundation
Principal Software Engineer Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808
We will be upgrading the Cloud-VPS OpenStack install tomorrow beginning
at 14:30 UTC (7:30 AM Pacific time).
The total upgrade should take 60-90 minutes. During the upgrade period
Horizon will be disabled. There may also be brief network interruptions
as we restart router services.
-Andrew + the WMCS Team