Hi,

Well, there was one surprise, and it did explode! It was probably very close to a full outage in codfw. We're still documenting everything and we'll publish a full incident report, but the TL;DR is that every single mwscript-k8s invocation in the for loop creates an entire Helm release, including all of its Kubernetes resources. This is by design, but it had an unforeseen consequence: close to 2k Calico NetworkPolicies were created (along with a ton of other resources, which would create their own set of problems). All the Calico components in the cluster had to gradually react to the growing number of policies, some of them hit their resource limits, which led to throttling, then to failures, and finally to a slowly cascading outage that was taking hardware nodes out of rotation one by one. The last couple of hours were interesting for some of us, I can tell you that.

We are already working on plans to fix this (we have plenty of action items already, including amending the design), but in the meantime, and until we announce that this is solved, please don't spawn mwscript-k8s in a for loop or anything similar. You can of course keep working with the tool to get acquainted with it, find bugs, etc.; just please don't launch hundreds, or even worse thousands, of invocations.

Brooke, I had to kill your bash shell on deploy2002 that was running the transcodes. I'm sorry about that, but despite attaching to your screen I couldn't find a way to stop it (it didn't respond to any of the usual control sequences or shell job control), and I didn't want to risk one more outage (which would probably have happened once the resources reached some critical number).

On Wed, Oct 9, 2024 at 6:16 AM Reuven Lazarus <rlazarus@wikimedia.org> wrote:
Great to hear, thanks!

As a side note for others, to highlight something Brooke said in passing: Wrapping mwscript-k8s in a bash for loop is a fine idea, as long as you're running it with --follow or --attach. In that case, each mwscript-k8s invocation will keep running to monitor the job's output, and will terminate when the job terminates. One job will run at a time, which is what you expect.

Without --follow or --attach, mwscript-k8s is just the launcher: it kicks off your job, then terminates immediately. Your for loop will launch all the jobs in rapid succession, one after another, which means hundreds of them might be executing simultaneously, and that might not be what you had in mind. If your job involves expensive DB operations, it really might not be what you had in mind.
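
To make the distinction concrete, here's a rough sketch of the safe pattern. It's illustrative only: the dblist path is an assumption about the deployment host layout, and Version.php with the T341553 comment are just stand-ins for your actual script and task.

  # One job at a time: -f makes each invocation wait until its job
  # finishes before the loop moves on to the next wiki.
  for wiki in $(cat /srv/mediawiki/dblists/all.dblist); do
      mwscript-k8s -f --comment="T341553" -- Version.php --wiki="$wiki"
  done

  # Without -f (or --attach), the same loop would return immediately from
  # each invocation and launch every job at essentially the same time.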

First-class dblist support will indeed make that pitfall easier to avoid. In the meantime there's nothing wrong with using a for loop, and it's what I'd do too -- but since this is a new system and nobody has well-honed intuition for it yet, I wanted to draw everyone's eye to that distinction.

On Tue, Oct 8, 2024 at 4:37 PM Brooke Vibber <bvibber@wikimedia.org> wrote:
I'm starting some batch maintenance of video transcodes, so I'm exercising the new k8s-based maintenance script system on TMH's requeueTranscodes.php. Good news: no surprises so far; everything's working just fine. :D

Since I'm running the same scripts over multiple wikis, I went ahead and manually wrapped them in a bash for loop so it submits one job at a time out of all.dblist, using a screen session for the wrapper loop and tailing the logs into the session so they don't all smash out at once, plus a second manually-started run for Commons. :)
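
Roughly, the wrapper loop is something like the sketch below. It's illustrative only: the dblist path is assumed, the real invocation passes more options to requeueTranscodes.php, and Commons is skipped here because that run is started by hand separately.

  # Illustrative sketch of the wrapper loop (running inside screen).
  for wiki in $(cat /srv/mediawiki/dblists/all.dblist); do
      [ "$wiki" = "commonswiki" ] && continue   # Commons run started separately
      # -f tails the job output into this session and blocks until the
      # job finishes, so only one job is submitted at a time.
      mwscript-k8s -f --comment="transcode requeue" -- requeueTranscodes.php --wiki="$wiki"
  done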

First-class support for running over a dblist will be a very welcome improvement, and should be pretty straightforward! Good work everybody. :D

The longest job (Commons) might take a couple days to run, so we'll see if anything explodes later! hehe

-- brooke

On Wed, Sep 25, 2024 at 8:11 PM Reuven Lazarus <rlazarus@wikimedia.org> wrote:

Hi all,


With MediaWiki at the WMF moving to Kubernetes, it's now time to start running manual maintenance scripts there. Any time you would previously SSH to an mwmaint host and run mwscript, follow these steps instead. The old way will continue working for a little while, but it will be going away.



What's familiar:


Starting a maintenance script looks like this:


  rzl@deploy2002:~$ mwscript-k8s --comment="T341553" -- Version.php --wiki=enwiki


Any options for the mwscript-k8s tool, as described below, go before the --.


After the --, the first argument is the script name; everything else is passed to the script. This is the same as what you're used to passing to mwscript.
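
For instance, combining flags (this just merges the flags described in this email into one illustrative invocation; the task number and wiki are placeholders):

  # tool options (-f, --comment) go before the --;
  # the script name and the script's own arguments go after it
  rzl@deploy2002:~$ mwscript-k8s -f --comment="T341553" -- Version.php --wiki=testwiki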



What's different:


- Run mwscript-k8s on a deployment host, not the maintenance host. Either deployment host will work; your job will automatically run in whichever data center is active, so you no longer need to change hosts when there’s a switchover.


- You don't need a tmux. By default the tool launches your maintenance script and exits immediately, without waiting for your job to finish. If you log out of the deployment host, your job keeps running on the Kubernetes cluster.


- Kubernetes saves the maintenance script's output for seven days after completion. By default, mwscript-k8s prints a kubectl command that you (or anyone else) can paste and run to monitor the output or save it to a file.


- As a convenience, you can pass -f (--follow) to mwscript-k8s to immediately begin tailing the script output. If you like, you can do this inside a tmux and keep the same workflow as before. Either way, you can safely disconnect and your script will continue running on Kubernetes.


  rzl@deploy2002:~$ mwscript-k8s -f -- Version.php --wiki=testwiki

  [...]

  MediaWiki version: 1.43.0-wmf.24 LTS (built: 22:35, 23 September 2024)


- For scripts that take input on stdin, you can pass --attach to mwscript-k8s, either interactively or in a pipeline.


  rzl@deploy2002:~$ mwscript-k8s --attach -- shell.php --wiki=testwiki

  [...]

  Psy Shell v0.12.3 (PHP 7.4.33 — cli) by Justin Hileman

  > $wmgRealm

  = "production"

  >


  rzl@deploy2002:~$ cat example_url.txt | mwscript-k8s --attach -- purgeList.php

  [...]

  Purging 1 urls

  Done!


- Your maintenance script runs in a Docker container which will not outlive it, so it can't save persistent files to disk. Ensure your script logs its important output to stdout, or persists it in a database or other remote storage.


- The --comment flag sets an optional (but encouraged) descriptive label, such as a task number.


- Using standard kubectl commands[1][2], you can check the status and view the output of your running jobs, or anyone else's. (Example: `kube_env mw-script codfw; kubectl get pod -l username=rzl`)
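
As a rough illustration (the pod-name placeholder and output filename below are made up; substitute whatever `kubectl get pod` prints, and pick eqiad or codfw as appropriate):

  rzl@deploy2002:~$ kube_env mw-script codfw

  rzl@deploy2002:~$ kubectl get pod -l username=rzl

  [...]

  rzl@deploy2002:~$ kubectl logs <pod-name-from-above> > version-output.txt

If the pod has more than one container, add `-c <container-name>` to the kubectl logs command.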


[1]: https://wikitech.wikimedia.org/wiki/Kubernetes/Kubectl 

[2]: https://kubernetes.io/docs/reference/kubectl/quick-reference/ 



What's not supported yet:


- Maintenance scripts launched automatically on a timer. We're working on migrating them -- for now, this is for one-off scripts launched by hand.


- If your job is interrupted (e.g. by hardware problems), Kubernetes can automatically move it to another machine and restart it, babysitting it until it completes. But we only want to do that if your job is safe to restart, so by default an interrupted job stays stopped until you restart it yourself. Soon we'll add an option to declare "this is idempotent, please restart it as needed"; writing scripts to be safely restartable is the recommended design for new ones.


- No support yet for mwscriptwikiset, foreachwiki, foreachwikiindblist, etc., but we'll add similar functionality as flags to mwscript-k8s.



Your feedback:


Let me know by email or IRC, or on Phab (T341553). If mwscript-k8s doesn't work for you, for now you can fall back to using the mwmaint hosts as before -- but they will be going away. Please report any problems sooner rather than later, so that we can ensure the new system meets your needs before that happens.


Thanks,

Reuven, for Service Ops SRE



--
Reuven Lazarus (he/him)
Staff Site Reliability Engineer
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/


--
Alexandros Kosiaris
Principal Site Reliability Engineer
Wikimedia Foundation