Hello all! I've been thinking more about how to move to a future of
quota-managed storage, and have written a few scraps of puppet code to
support us moving off of the old labs::lvm roles. Attached below is a
draft of an announcement email explaining the changes; writing this was
the easiest way for me to think through what the changes would feel like
to our users.
For context:
As best I can tell the only automated uses of our lvm puppet classes are
via profile::labs::lvm::srv. So, while my announcement draft tries to
acknowledge a diversity of use cases, I think there's really only one,
which is "mount all extra space in /srv". That workflow will be
replaced with a similar but slightly more cumbersome "attach a new
cinder volume and then mount all the space on that volume in /srv/".
A quick grep suggests that the most affected users will be
people building new VMs for toolforge, CI, and Quarry. I'm not feeling
at all bad about asking those folks to edit their puppet classes but I
am feeling kind of bad about adding a manual per-VM horizon step;
perhaps someday we'll have instrumentation for that.
I welcome your thoughts about both the wording of this email and the
overall plan that it implies. Thanks!
-A
====
With Cinder storage[0] now available and well-tested on cloud-vps, it's
time to move forward to widespread adoption. As of today, use of lvm
(e.g. via role::labs::lvm::srv) is deprecated in favor of cinder volumes
for all local storage outside of the initial 20GB allocated for
standard OS functions.
In support of this move we'll be making several technical changes to our
infrastructure over the coming days and weeks:
- Default per-project cinder usage quotas will be increased from 10GB to
80GB
- Instance flavors with variable disk sizes will be removed. Going
forward all flavors will have a default local storage size of 20GB
regardless of associated cores or RAM.
- The old lvm puppet roles will continue to function as before on legacy
VMs. On new VMs these roles will detect the lack of available space for
partitioning and report failures[1].
- Instance resizing will be enabled in the Horizon UI. This will allow
you to adjust the cores and RAM allocated to a given VM
without rebuilding.
Manual setup of Cinder volumes is documented here:
https://wikitech.wikimedia.org/wiki/Help:Adding_Disk_Space_to_Cloud_VPS_ins…
If your workflow involves puppet roles that depend on labs_lvm, you will
need to alter both your puppet code and your manual processes. I've
provided some helper roles to assist. The simplest[2] is the new
'role::labs::cindermount::srv' which in most cases can be used as a
drop-in replacement for role::labs::lvm::srv: once a cinder volume is
attached to a VM that role will detect, format, and mount it as /srv.
The base class cinderutils::ensure[3] can be reused for other variations
on this theme. Please don't hesitate to contact me or other WMCS staff
for assistance with puppet patches!
In some cases the default quota of 80GB will be insufficient for a full
migration to cinder from the old flavor model, especially if you have
made use of the old 'large' or 'xlarge' sizes. When it comes time to
rebuild these larger VMs, please open a quota request ticket detailing
which VMs (of which flavors) will be deleted. We will try to be quick
and generous with quota expansions to assist in migration.
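If it helps when preparing a quota request, the arithmetic can be
sketched roughly as below. This is only an illustration: the flavor
names and disk sizes in the table are made-up placeholders rather than
the real flavor definitions, and the function names are hypothetical.
Only the 20GB local disk and the 80GB default quota come from this
announcement.

```python
# Rough estimate of the cinder quota a project might need after rebuilding.
# Assumption: each rebuilt VM keeps 20GB of local disk and any space beyond
# that moves to a cinder volume.

BASE_DISK_GB = 20       # local disk on every new flavor (from this email)
DEFAULT_QUOTA_GB = 80   # new default per-project cinder quota (from this email)

# Hypothetical old flavor disk sizes, for illustration only.
EXAMPLE_FLAVOR_DISK_GB = {"medium": 40, "large": 80, "xlarge": 160}


def cinder_gb_needed(vm_flavors, flavor_disk_gb=EXAMPLE_FLAVOR_DISK_GB):
    """Sum the disk space beyond the base 20GB across the given VMs."""
    return sum(max(flavor_disk_gb[f] - BASE_DISK_GB, 0) for f in vm_flavors)


def extra_quota_gb(vm_flavors, flavor_disk_gb=EXAMPLE_FLAVOR_DISK_GB):
    """Return how many GB beyond the default quota would be needed."""
    needed = cinder_gb_needed(vm_flavors, flavor_disk_gb)
    return max(needed - DEFAULT_QUOTA_GB, 0)


if __name__ == "__main__":
    # One VM per entry, named by its (hypothetical) old flavor.
    vms = ["large", "xlarge", "medium"]
    print("cinder GB needed:", cinder_gb_needed(vms))
    print("GB to request beyond default quota:", extra_quota_gb(vms))
```

Substituting the actual disk sizes of the VMs you plan to delete gives a
starting figure to include in the ticket.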
[0] https://techblog.wikimedia.org/2021/02/05/cinder-on-cloud-vps/
[1] https://gerrit.wikimedia.org/r/c/operations/puppet/+/668567
[2] https://gerrit.wikimedia.org/r/c/operations/puppet/+/669958
[3] https://gerrit.wikimedia.org/r/c/operations/puppet/+/668757