Hi,
On 12/6/23 03:59, Magnus Manske via Cloud wrote:
Hi all,
I do appreciate the efforts to keep toolforge running, and that sometimes massive changes are necessary to do this, which has implications for tool maintainers.
+1. I am not sure people fully appreciate how massive a change this is: the grid engine is one of the few remaining parts of Toolforge that actually predates it. River announced[1] SGE support for the Toolserver back in September 2009, and then in January 2013 everyone was given exactly one month(!!) to move their bots to SGE[2].
So it's a real milestone for both the infrastructure side and maintainers to have made it this far in getting rid of it, but it also means there's 10+ years of user familiarity, expectations and inertia towards the grid.
K8s, as it's run right now on toolforge, can not
- ...
- has very limited per-tool resources, and the webservice reduces those
even further
Just FYI if you weren't aware, the default quotas were recently raised to 8CPU + 8GB total, with a max of 3CPU + 6GB per pod. (This is also something I ran into.)
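To make those numbers concrete, here's a rough Python sketch that checks whether a planned set of pods fits within the figures above. The `Resources` class and `fits_quota` function are names I made up for illustration, not part of any Toolforge tooling, and it assumes each pod's request equals its limit, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu: float     # CPU cores
    mem_gb: float  # memory in GB


# Figures quoted in this thread; they may change, so treat them as assumptions.
TOOL_QUOTA = Resources(cpu=8, mem_gb=8)  # total across all pods of a tool
POD_LIMIT = Resources(cpu=3, mem_gb=6)   # maximum for a single pod


def fits_quota(pods: list[Resources]) -> bool:
    """Return True if every pod is within the per-pod max and the sum is within the total."""
    for pod in pods:
        if pod.cpu > POD_LIMIT.cpu or pod.mem_gb > POD_LIMIT.mem_gb:
            return False
    total_cpu = sum(pod.cpu for pod in pods)
    total_mem = sum(pod.mem_gb for pod in pods)
    return total_cpu <= TOOL_QUOTA.cpu and total_mem <= TOOL_QUOTA.mem_gb
```

So one maxed-out pod (3 CPU + 6 GB) leaves room for only 2 GB more across everything else, and two maxed-out pods don't fit at all.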
- Even the current Wikitech documentation still uses grid engine, eg
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Rust (I have tried, and failed, to get that running on k8s)
Sorry, this was on me. I had been pinged a while back to update it, but it took me a while to figure it out for my own tools and I was still tweaking my own setup. I've updated it now, so the main "Rust" wiki page explains how to use the jobs framework, but I still need to update the "My first Rust tool" guide. It's kind of cumbersome because k8s doesn't spawn a login shell, so we have to do it manually, but I guess that's for the best? If there's any other Rust-related stuff I can help with, please let me know.
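For anyone wondering what "do it manually" can look like, here's a minimal Python sketch of the general idea: wrap the real command so a shell started with `-l` (login mode, which reads the profile files) runs it. This is an illustration only, not necessarily what the wiki page does; the helper names and the `cargo run --release` command are hypothetical, and it assumes `bash` is available in the container image.

```python
import shlex


def login_shell_argv(command: str, shell: str = "bash") -> list[str]:
    """Build an argv that runs `command` inside a login shell."""
    # -l makes the shell act as a login shell (so profile files are read);
    # -c tells it to run the command string that follows.
    return [shell, "-l", "-c", command]


def login_shell_string(command: str, shell: str = "bash") -> str:
    """Same as above, but as a single safely quoted string for pasting into a job command."""
    return shlex.join(login_shell_argv(command, shell))


if __name__ == "__main__":
    # Hypothetical command, used only as an example.
    print(login_shell_string("cargo run --release"))
```

Run directly, this prints `bash -l -c 'cargo run --release'`.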
Anyways, I feel like I'm in a similar boat overall. I've mostly spent the last two weeks just taking stock of my tools, rewriting some and shutting down others. I found it useful to spend a bit of time homogenizing how my tools are laid out so I could just write an Ansible playbook[3] to deploy all of them in a similar fashion and apply multi-tool changes easily too (I plan to explain this in a forthcoming blog post, very soon now).
I've already asked for the February extension for at least one of my tools, and I think it's pretty reasonable for you to ask as well. I am not sure how long the lifeline can last though: the Debian Buster LTS end-of-life is coming up in June 2024 and I'm sure there are other considerations too.
[1] https://lists.wikimedia.org/hyperkitty/list/toolserver-l@lists.wikimedia.org...
[2] https://lists.wikimedia.org/hyperkitty/list/toolserver-l@lists.wikimedia.org...
[3] https://gitlab.wikimedia.org/legoktm/toolforge-ansible
-- Legoktm