[Labs-l] [Labs-announce] IMPORTANT: Can your instance tolerate a reboot? (finished)
Andrew Bogott
abogott at wikimedia.org
Mon May 11 14:26:27 UTC 2015
This move is now done. Please contact me immediately if you're having
trouble with any of your instances -- there are currently backups of
most instance files remaining on the old hardware, but I'll be cleaning
that up later in the week.
-Andrew
On 4/28/15 5:10 PM, Andrew Bogott wrote:
> Executive summary:
>
> I'm going to reboot a ton of instances at random times next week. If
> you don't want me to reboot yours, email me.
>
> Don't worry about Tools or Deployment-prep; they're already on the
> 'handle with care' list.
>
> Explanation:
>
> I've been migrating instances to new hardware like crazy, and this
> morning discovered that, upon arrival on a new server, instances are
> taking up MUCH more disk space than they were. In some cases, 10 or
> 15 times as much.
>
> This turns out to be an issue with live migration and copy-on-write
> instances. The live migration code doesn't know about
> never-used-and-not-allocated-space in an instance, so when I migrate
> an xlarge instance that only used 8G of disk space (but had an
> allocated 160G of space), live-migrate copies that extra 152G of
> emptiness, thus foiling all of our attempts to safely overprovision.
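
The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustrative model, not the actual migration code: the `Instance` class, the `footprint_after_migration` function, and the instance name are hypothetical, and the sketch assumes (per the description above) that live migration writes out the full allocated size while cold migration keeps roughly the space actually used.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical record of an instance's disk figures, in GB."""
    name: str
    allocated_gb: int  # virtual disk size promised to the instance
    used_gb: int       # space the instance has actually written


def footprint_after_migration(inst: Instance, mode: str) -> int:
    """Return the disk space (GB) the instance occupies on the new host."""
    if mode == "live":
        # Assumption from the email: unused, unallocated space is copied too.
        return inst.allocated_gb
    if mode == "cold":
        # Simplification: the copy-on-write image keeps its used size.
        return inst.used_gb
    raise ValueError(f"unknown migration mode: {mode!r}")


# Figures from the email: an xlarge instance using 8G of an allocated 160G.
xlarge = Instance(name="example-xlarge", allocated_gb=160, used_gb=8)

live_gb = footprint_after_migration(xlarge, "live")
cold_gb = footprint_after_migration(xlarge, "cold")
extra_gb = live_gb - cold_gb

print(f"live: {live_gb}G, cold: {cold_gb}G, extra copied: {extra_gb}G")
```

With the numbers quoted above, the difference between the two modes is the 152G of empty space mentioned in the email.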
>
> Cold migration (which involves shutting down an instance and copying
> the whole VM in one lump) does not have this problem. So, two things
> are going to happen:
>
> - Instances that have not yet moved to the new hardware will be cold
> migrated rather than live migrated. That means a shutdown, a
> few-minute delay, and a restart.
>
> - Large and XLarge instances that have already migrated need to be
> re-shrunk to their proper copy-on-write size. That's pretty quick,
> but also requires a stop and start.
>
> If the idea of a few minutes of downtime for your instance doesn't
> worry you, then you can do nothing. If you need your downtime
> scheduled /or/ if downtime is unacceptable, just let me know. I can
> live-migrate a few extra-precious instances and avoid downtime if
> needed. Don't hesitate to ask.
>
> None of this will start until Monday the 4th. That gives us a
> good long while to get Tools squared away, and also gives you a good
> long time to notice this email :)
>
> Sorry this upgrade has involved so many additional complications!
>
> -Andrew
>
_______________________________________________
Labs-announce mailing list
Labs-announce at lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/labs-announce