Thank you for starting the discussion! I'll start by saying that while I'm not convinced by the arguments for moving everything from OpenStack to containers, I tend to agree that OpenStack has some features we should reconsider offering at all. For example, my current feeling is that the half-baked PostgreSQL support in Trove is a net negative, considering incidents like [0] that take a significant amount of admin time to troubleshoot and resolve and, as a result, cause quite a bit of downtime for our users.
[0]: https://phabricator.wikimedia.org/T355138
First, I am highly sceptical of the Kubernetes cluster-in-cluster solutions you mention. As far as we as infrastructure operators should be concerned, the ability to run arbitrary workloads in a Kubernetes cluster is equivalent to full root on all of the worker nodes. (The most obvious example is running a pod as root that mounts /etc/passwd, or a similarly privileged file, read-write as a hostPath, but that's far from the only way to break out of the pod sandbox.) We solve this in Toolforge with PSPs that severely restrict the configurations of pods that are allowed to run. I am having a hard time imagining how a shared cluster without those Toolforge-level restrictions could preserve the strong tenant isolation that I, at least, consider a hard requirement for any of our offerings.
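To illustrate the kind of breakout I mean: on a cluster without PSP-style admission control, a minimal pod spec along these lines (names and image are hypothetical, just for the example) would give the pod read-write access to the host's /etc:

```yaml
# Hypothetical example only -- this is exactly the shape of pod that
# admission control must reject on a multi-tenant cluster.
apiVersion: v1
kind: Pod
metadata:
  name: breakout-demo        # made-up name for illustration
spec:
  containers:
  - name: shell
    image: debian:bookworm   # any image with a shell will do
    command: ["sleep", "infinity"]
    securityContext:
      runAsUser: 0           # run as root inside the container
    volumeMounts:
    - name: host-etc
      mountPath: /host-etc   # host's /etc, writable from the pod
  volumes:
  - name: host-etc
    hostPath:
      path: /etc
      type: Directory
```

From there, editing /host-etc/passwd (or dropping a key into a root user's authorized_keys via a similar mount) is root on the worker node, which is why PSPs in Toolforge forbid hostPath volumes and root pods outright.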
From my experience maintaining the K8s cluster in Toolforge, I can say that upgrading that cluster is consistently one of the most stressful things I do around here, and that's with the majority of the workload being managed by us. Yes, the process is rather simple now, but compared to OpenStack, Kubernetes plays a very active role in the continued running of already-existing workloads. The blast radius of a Kubernetes upgrade going wrong is much larger than that of, say, a Nova upgrade going wrong, which would affect starting new VMs and stopping existing ones but generally would not touch VMs already running in libvirt. So far there's been only one (if I remember correctly) upgrade-related major service degradation that made it to live Toolforge[1], but I would credit that more to the slow pace of our upgrades and the countless hours I've spent reading changelogs and docs and testing the upgrades locally and in toolsbeta. And as a reminder, we're currently about two years behind Kubernetes releases and don't seem to be catching up, even after upstream reduced from four 1.x releases a year to three.
[1]: https://phabricator.wikimedia.org/T308189
The fact that Toolforge K8s runs in VMs is very helpful due to the flexibility it gives: if I want to test a particular worker configuration, I can currently just spin up a VM instead of having to figure out where to find hardware for it (as I'm currently having to do for the OVS tests). Also, many projects are just too small to need dedicated hardware. For example, LibUp[2], a project I recently became involved with and that doesn't neatly fit into Toolforge at the moment, currently uses about 10 vCPUs and about 20 GiB of RAM; we can stuff maybe 10-20 projects of that size onto 1U of rack space on a modern high-spec virtualization node, so giving it dedicated hardware to run a K8s cluster on top of just does not make sense. And yes, you could replace OpenStack with something like Ganeti with much less management overhead, but my gut feeling is that Nova and Neutron and the other 'core' services involved in running traditional VMs are relatively well-behaved compared to the newer stuff, and they also give us useful features (like multi-tenancy, and instance isolation from the management/wikiland networks) that we'd have to invent ourselves if we ditched OpenStack.
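The packing argument is easy to sketch with round numbers. The node specs below are my own assumed figures for a modern high-spec 1U virtualization node, not actual WMCS hardware specs:

```python
# Back-of-the-envelope estimate: how many LibUp-sized projects fit on
# one virtualization node? Node specs are assumed round numbers.
NODE_VCPUS = 128      # assumed: dual 32-core CPUs with SMT
NODE_RAM_GIB = 512    # assumed installed memory

PROJECT_VCPUS = 10    # LibUp's approximate current usage
PROJECT_RAM_GIB = 20

# The binding resource is whichever runs out first.
fit = min(NODE_VCPUS // PROJECT_VCPUS, NODE_RAM_GIB // PROJECT_RAM_GIB)
print(f"~{fit} projects per 1U node, before any CPU overcommit")
```

With any realistic CPU overcommit ratio the number lands in the 10-20 range mentioned above, which is the point: dedicating a whole node (let alone several, for a K8s cluster) to one such project wastes most of it.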
[2]: https://www.mediawiki.org/wiki/LibUp
And finally, I disagree with the statement that maintaining a Linux server is more difficult than running something in Kubernetes (even if someone else maintains the cluster itself). At least in my mind, a modern Kubernetes deployment has a million more moving parts than a simple Linux server that our users can SSH into and apt-get install a web server on to run their app.
Taavi
On Thu, Feb 29, 2024 at 7:12 PM Arturo Borrero Gonzalez aborrero@wikimedia.org wrote:
Hi there,
Last year, we started evaluating how we could refresh the way we relate to (deploy, maintain, upgrade) our Openstack deployment for Cloud VPS [0].
One of the most compelling options we found was to run Openstack inside Kubernetes, using an upstream project called openstack-helm.
But... What if we stopped doing Openstack at all?
To clarify, the base idea I had is:
- deploy Kubernetes to a bunch of hosts in one of our Wikimedia datacenters
** we know how to do it!
** this would be the base, undercloud, or bedrock, whatever.
- deploy ceph next to k8s (maybe, inside even?)
** ceph would remain the preferred network storage solution
- deploy some kind of k8s multiplexing tech
** example: https://www.vcluster.com/ but there could be others
** using this, create a dedicated k8s cluster for each project, for example: toolforge/toolsbeta/etc
- Inside this new VM-less toolforge, we can retain pretty much the same functionalities as today:
** a container listening on 22/tcp with kubectl & toolforge cli installed can be the login bastion
** NFS server can be run in a container, using ceph
** toolsDB can be run in a container. Can't it? Or maybe replace it with some other k8s-native solution
- If we need any of the native openstack components, for example Keystone or Swift, we may run them in a standalone fashion inside k8s.
- We already have some base infrastructure (and knowledge) that would support this model. We have cloudlbs, cloudgw, we know how to do ceph, etc.
- And finally, and most important: the community. The main question could be:
** Is there any software running on Cloud VPS virtual machines that cannot run on a container in kubernetes?
I wanted to start this email hoping that I would collect a list of use cases, blockers, and strong opinions about why running Openstack is important (or not). I'm pretty sure I'm overlooking some important thing.
I plan to document all this on wikitech, and/or maybe phabricator.
You may ask: and why stop doing openstack? I will answer that in a different email to keep this one as short as possible.
Looking forward to your counter-arguments. Thanks!
regards.
[0] https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Enhancemen...