Maintenance scripts are moving to Kubernetes

25 Sep 2024


Hi all,
With MediaWiki at the WMF moving to Kubernetes, it's now time to start
running manual maintenance scripts there. Any time you would previously SSH
to a mwmaint host and run mwscript, follow these steps instead. The old way
will continue working for a little while, but it will be going away.
What's familiar:
Starting a maintenance script looks like this:
rzl@deploy2002:~$ mwscript-k8s --comment="T341553" -- Version.php --wiki=enwiki
Any options for the mwscript-k8s tool, as described below, go before the --.
After the --, the first argument is the script name; everything else is
passed to the script. This is the same as you're used to passing to
mwscript.
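To make the split at the -- concrete, here is a small shell sketch. split_args is a hypothetical helper written for this illustration, not part of mwscript-k8s, and it only mimics the convention described above; the real tool's parsing may differ in its details.

```shell
# Hypothetical helper, not part of mwscript-k8s: shows how a command line
# divides at the first "--".
split_args() {
  tool_opts=""
  # Everything before the first "--" is an option for the tool itself.
  while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
    tool_opts="${tool_opts:+$tool_opts }$1"
    shift
  done
  if [ "$#" -gt 0 ]; then shift; fi   # drop the "--" separator
  script="${1:-}"                     # first argument after "--" is the script name
  if [ "$#" -gt 0 ]; then shift; fi   # drop the script name
  echo "tool options: $tool_opts"
  echo "script name: $script"
  echo "script args: $*"              # the rest is passed to the script
}

split_args --comment="T341553" -- Version.php --wiki=enwiki
```

For the command line above, this reports --comment=T341553 as the tool option, Version.php as the script name, and --wiki=enwiki as the script's argument.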
What's different:
- Run mwscript-k8s on a deployment host, not the maintenance host. Either
deployment host will work; your job will automatically run in whichever
data center is active, so you no longer need to change hosts when there’s a
switchover.
- You don't need a tmux. By default the tool launches your maintenance
script and exits immediately, without waiting for your job to finish. If
you log out of the deployment host, your job keeps running on the
Kubernetes cluster.
- Kubernetes saves the maintenance script's output for seven days after
completion. By default, mwscript-k8s prints a kubectl command that you (or
anyone else) can paste and run to monitor the output or save it to a file.
- As a convenience, you can pass -f (--follow) to mwscript-k8s to immediately
begin tailing the script output. If you like, you can do this inside a tmux
and keep the same workflow as before. Either way, you can safely disconnect
and your script will continue running on Kubernetes.
rzl@deploy2002:~$ mwscript-k8s -f -- Version.php --wiki=testwiki
[...]
MediaWiki version: 1.43.0-wmf.24 LTS (built: 22:35, 23 September 2024)
- For scripts that take input on stdin, you can pass --attach to
mwscript-k8s, either interactively or in a pipeline.
rzl@deploy2002:~$ mwscript-k8s --attach -- shell.php --wiki=testwiki
[...]
Psy Shell v0.12.3 (PHP 7.4.33 — cli) by Justin Hileman
...
$wmgRealm
= "production"
...
rzl@deploy2002:~$ cat example_url.txt | mwscript-k8s --attach -- purgeList.php
[...]
Purging 1 urls
Done!
- Your maintenance script runs in a Docker container which will not outlive
it, so it can't save persistent files to disk. Ensure your script logs its
important output to stdout, or persists it in a database or other remote
storage.
- The --comment flag sets an optional (but encouraged) descriptive label,
such as a task number.
- Using standard kubectl commands[1][2], you can check the status, and view
the output, of your running jobs or anyone else's. (Example:
`kube_env mw-script codfw; kubectl get pod -l username=rzl`)
[1]: https://wikitech.wikimedia.org/wiki/Kubernetes/Kubectl
[2]: https://kubernetes.io/docs/reference/kubectl/quick-reference/
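As a rough illustration of the kubectl workflow mentioned in the list above, the sketch below prints (but does not run) the commands you might paste to find and follow a job. print_job_commands is a hypothetical helper, and POD_NAME is a placeholder you would take from the get pod output. The first two commands come from the example above; kubectl logs -f is the standard way to follow a pod's output, but the exact invocation for these jobs (for instance, whether a container name is needed) is an assumption, so prefer the command that mwscript-k8s prints for you.

```shell
# Hypothetical helper: prints the commands rather than running them, so it is
# safe to try on any machine.
print_job_commands() {
  # $1 = username label, $2 = data center, $3 = pod name (placeholder if omitted)
  # Select the mw-script environment in the given data center.
  echo "kube_env mw-script $2"
  # List the pods labelled with that username.
  echo "kubectl get pod -l username=$1"
  # Follow one pod's output (assumption: no container name is required).
  echo "kubectl logs -f ${3:-POD_NAME}"
}

print_job_commands rzl codfw
```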
What's not supported yet:
- Maintenance scripts launched automatically on a timer. We're working on
migrating them -- for now, this is for one-off scripts launched by hand.
- If your job is interrupted (e.g. by hardware problems), Kubernetes can
automatically move it to another machine and restart it, babysitting it
until it completes. But we only want to do that if your job is safe to
restart. So by default, if your job is interrupted, it will stay stopped
until you restart it yourself. Soon, we'll add an option to declare "this
is idempotent, please restart it as needed". We recommend designing new
scripts to be idempotent so that they can take advantage of it.
- No support yet for mwscriptwikiset, foreachwiki, foreachwikiindblist,
etc., but we'll add similar functionality as flags to mwscript-k8s.
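On the point about restart-safe jobs: one common way to make a job idempotent is to record which items are finished and skip them on a rerun. The sketch below is only an illustration of that idea in shell, not how any existing maintenance script works. DONE_ITEMS is a hypothetical stand-in for state kept somewhere durable (a database, for example), since the container's disk does not persist.

```shell
# Sketch of a restart-safe job. DONE_ITEMS stands in for remote state (for
# example a database flag); every name here is hypothetical.
DONE_ITEMS=""

# Succeeds if the given item has already been recorded as finished.
already_done() {
  case " $DONE_ITEMS " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Processes each item once, skipping anything finished on an earlier attempt.
process_all() {
  for item in "$@"; do
    if already_done "$item"; then
      echo "skip $item"
    else
      echo "process $item"
      DONE_ITEMS="$DONE_ITEMS $item"
    fi
  done
}

process_all page1 page2          # first attempt
process_all page1 page2 page3    # restart: only page3 is new work
```

Because finished items are skipped, running the job again after an interruption repeats no work, which is the property that makes an automatic restart safe.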
Your feedback:
Let me know by email or IRC, or on Phab (T341553
https://phabricator.wikimedia.org/T341553). If mwscript-k8s doesn't work
for you, for now you can fall back to using the mwmaint hosts as before --
but they will be going away. Please report any problems sooner rather than
later, so that we can ensure the new system meets your needs before that
happens.
Thanks,
Reuven, for Service Ops SRE
