Hello all! I've been thinking more about how to move to a future of quota-managed storage, and have written a few scraps of puppet code to support us moving off of the old labs::lvm roles. Attached below is a draft of an announcement email explaining the changes; writing this was the easiest way for me to think through what the changes would feel like to our users.
For context:
As best I can tell, the only automated uses of our lvm puppet classes are via profile::labs::lvm::srv. So, while my announcement draft tries to acknowledge a diversity of use cases, I think there's really only one, which is "mount all extra space in /srv". That workflow will be replaced with a similar but slightly more cumbersome "attach a new cinder volume and then mount all the space on that volume in /srv/".
A quick grep suggests that the most affected users will be people building new VMs for toolforge, CI, and Quarry. I'm not feeling at all bad about asking those folks to edit their puppet classes, but I am feeling kind of bad about adding a manual per-VM horizon step; perhaps someday we'll have instrumentation for that.
I welcome your thoughts about both the wording of this email and the overall plan that it implies. Thanks!
-A
====
With Cinder storage[0] now available and well-tested on cloud-vps, it's time to move forward to widespread adoption. As of today, use of lvm (e.g. via role::labs::lvm::srv) is deprecated in favor of cinder volumes for all local storage outside of the initial 20GB allocated for standard OS functions.
In support of this move we'll be making several technical changes to our infrastructure over the coming days and weeks:
- Default per-project cinder usage quotas will be increased from 10GB to 80GB
- Instance flavors with variable disk sizes will be removed. Going forward all flavors will have a default local storage size of 20GB regardless of associated cores or RAM.
- The old lvm puppet roles will continue to function as before on legacy VMs. On new VMs these roles will detect the lack of available space for partitioning and report failures[1].
- Instance resizing will be enabled in the Horizon UI. This will allow you to adjust the cores and RAM allocated to a given VM without rebuilding.
Manual setup of Cinder volumes is documented here:
https://wikitech.wikimedia.org/wiki/Help:Adding_Disk_Space_to_Cloud_VPS_inst...
If your workflow involves puppet roles that depend on labs_lvm, you will need to alter both your puppet code and your manual processes. I've provided some helper roles to assist. The simplest[2] is the new 'role::labs::cindermount::srv' which in most cases can be used as a drop-in replacement for role::labs::lvm::srv: once a cinder volume is attached to a VM that role will detect, format, and mount it as /srv. The base class cinderutils::ensure[3] can be reused for other variations on this theme. Please don't hesitate to contact me or other WMCS staff for assistance with puppet patches!
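For illustration only, here is a minimal sketch of what the swap might look like inside a project's own role. The wrapper class name role::myproject::server is hypothetical, and the sketch assumes the new role needs no required parameters; check the linked patches for the actual interface.

```puppet
# Hypothetical project role, shown only to illustrate the swap.
class role::myproject::server {
    # Before (deprecated): mounted leftover local LVM space on /srv
    # include role::labs::lvm::srv

    # After: detects, formats, and mounts an attached cinder volume on /srv.
    # Assumes a cinder volume has already been attached to the VM.
    include role::labs::cindermount::srv
}
```

The cinder volume still has to be created and attached to the VM before puppet has anything to mount.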
In some cases the default quota of 80GB will be insufficient for a full migration to cinder from the old flavor model, especially if you have made use of the old 'large' or 'xlarge' sizes. When it comes time to rebuild these larger VMs, please open a quota request ticket and detail which VMs (of which flavors) will be deleted, and we will try to be quick and generous with quota expansions to assist in migration.
[0] https://techblog.wikimedia.org/2021/02/05/cinder-on-cloud-vps/
[1] https://gerrit.wikimedia.org/r/c/operations/puppet/+/668567
[2] https://gerrit.wikimedia.org/r/c/operations/puppet/+/669958
[3] https://gerrit.wikimedia.org/r/c/operations/puppet/+/668757