Cloud-admin April 2023

cloud-admin@lists.wikimedia.org

4 participants
3 discussions

Help with a Wikimania presentation on how Cloud Services governance works?
by Bryan Davis 11 Aug '23

After talking with both Arturo and Birgit about things we might present at Wikimania, I came up with this abstract for a talk:

Co-creating platforms and products: how the Wikimedia Cloud Services team works with the larger Wikimedia technical community to build and maintain Cloud VPS, Toolforge, Quarry, PAWS, and more

Did you know that volunteers are involved in planning, building, and maintaining the Cloud VPS and Toolforge projects as co-equals with paid staff from the Wikimedia Foundation? Since the start of the "Labs" project in 2011, one of the guiding principles for WMCS projects has been improving collaboration between Foundation staff and technical volunteers. Learn more about some of the policies and practices that are used to make this collaboration possible.

The submission would be under either the "governance" or "technology" tracks. I think it would work best as a panel discussion that is either "hybrid" (some folks in Singapore, some on-line) or pre-recorded video.

I think this is something that folks in the community might be interested in learning a bit about. I also think it would be interesting for those of us who have participated in this process to take some time to reflect on how we have worked together in the past and how we might like to see those processes and practices evolve in the future.

To make this talk work well there should be active voices from both the paid and volunteer staff involved. Towards that end, I'm mailing the cloud-admin@ list + 4 of you that I know have been active in the past in helping with Toolforge and/or Cloud VPS admin and features work to gauge your interest in participating.

Thoughts?

Bryan
-- 
Bryan Davis    Technical Engagement    Wikimedia Foundation
Principal Software Engineer    Boise, ID USA
[[m:User:BDavis_(WMF)]]    irc: bd808

How and when to notify about maintenance
by Andrew Bogott 17 Apr '23

In response to our recent maintenance windows we got some feedback[0] about advance notice of outages. I created this chart to provide us with some internal guidelines about when we should publicize maintenance, and how to do so:

https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Maintenance_noti…

You will notice that at the moment my imagination is limited to 'write to a mailing list.' I encourage people to fill in ideas on that page (or the associated talk page) about other ways we can warn people about these things. If we wind up with so many broadcast channels that it becomes impractical to actually use them all we can invest in automation.

I'm also not especially committed to the brackets on that chart; I'd like to have broad categories and low standards, but edits are welcome!

One thing that I want to be more mindful about is the distinction between "things that mess with our users" (e.g. quarry or horizon downtime) vs. "things that mess with our users' users" (e.g. web proxy downtime.) I'd love it if someone with better wiki-editing skills spruced up the chart to reflect that difference.

-A

[0] for example https://phabricator.wikimedia.org/T333477#8764263

Re: [Cloud-announce] Toolforge Kubernetes upgrade on 2023-04-03 (new date: 2023-04-10)
by Arturo Borrero Gonzalez 10 Apr '23

On 3/30/23 12:42, Arturo Borrero Gonzalez wrote:
> On 3/28/23 00:13, Taavi Väänänen wrote:
>> Hi,
>>
>> We will be upgrading the Toolforge Kubernetes cluster next Monday (2023-04-03)
>> starting at around 10:00 UTC.
>>
>> The expected impact is that tools running on the Kubernetes cluster will get
>> restarted a couple of times over the course of the few hours it takes for us
>> to upgrade the entire cluster. The ability to manage tools will remain
>> operational.
>>
>> Since the version we're upgrading to (1.22) removes a bunch of deprecated
>> Kubernetes APIs, tools that use kubectl and raw Kubernetes resources directly
>> may want to check that they're on the latest available versions. The vast
>> majority of tools that are only using the Jobs framework and/or the webservice
>> command are not affected by these changes.
>>
>
> This has been rescheduled to Monday 2023-04-10 to leave room for the other
> operations we have.
>

Hi there!

This is happening now!

https://phabricator.wikimedia.org/T286856

regards.

-- 
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation
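
For tools that do manage raw Kubernetes resources, the check suggested in the announcement could look roughly like the sketch below. It is not an official Toolforge tool. The function name `find_removed_api_versions` and the line-based scan are assumptions made for illustration, and the table lists the beta API groups that upstream Kubernetes stopped serving in 1.22. It only inspects top-level `apiVersion:` lines in manifest text, so objects nested inside lists are not covered.

```python
import re

# Beta API versions that upstream Kubernetes no longer serves as of 1.22,
# mapped to the stable version that replaces them.
REMOVED_IN_1_22 = {
    "extensions/v1beta1": "networking.k8s.io/v1",  # Ingress
    "networking.k8s.io/v1beta1": "networking.k8s.io/v1",
    "apiextensions.k8s.io/v1beta1": "apiextensions.k8s.io/v1",
    "admissionregistration.k8s.io/v1beta1": "admissionregistration.k8s.io/v1",
    "apiregistration.k8s.io/v1beta1": "apiregistration.k8s.io/v1",
    "authentication.k8s.io/v1beta1": "authentication.k8s.io/v1",
    "authorization.k8s.io/v1beta1": "authorization.k8s.io/v1",
    "certificates.k8s.io/v1beta1": "certificates.k8s.io/v1",
    "coordination.k8s.io/v1beta1": "coordination.k8s.io/v1",
    "rbac.authorization.k8s.io/v1beta1": "rbac.authorization.k8s.io/v1",
    "scheduling.k8s.io/v1beta1": "scheduling.k8s.io/v1",
    "storage.k8s.io/v1beta1": "storage.k8s.io/v1",
}

# Matches a line such as `apiVersion: networking.k8s.io/v1beta1`,
# with optional quotes around the value.
API_VERSION_RE = re.compile(r"^\s*apiVersion:\s*['\"]?([^'\"\s#]+)")


def find_removed_api_versions(manifest_text):
    """Return (line number, old apiVersion, suggested replacement) tuples."""
    findings = []
    for lineno, line in enumerate(manifest_text.splitlines(), start=1):
        match = API_VERSION_RE.match(line)
        if match and match.group(1) in REMOVED_IN_1_22:
            old = match.group(1)
            findings.append((lineno, old, REMOVED_IN_1_22[old]))
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python check_apis.py manifest.yaml [more.yaml ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for lineno, old, new in find_removed_api_versions(handle.read()):
                print(f"{path}:{lineno}: {old} is removed in 1.22, use {new}")
```

Switching the `apiVersion` is not always sufficient on its own: some resources, Ingress in particular, also changed their schema between the beta and stable versions, so flagged manifests still need a manual review.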
