[Labs-l] labmon1001 downtime scheduled for 2016-06-23 from 15:00 GMT to 23:00 GMT

Rob Halsell rhalsell at wikimedia.org
Thu Jun 23 22:00:38 UTC 2016


Please note that the expected downtime for labmon1001 has been extended to
2016-06-24 @ 17:00 GMT.  Details are noted on
https://phabricator.wikimedia.org/T137924.

Basically I'm planning to watch it migrate data the rest of this evening
(it's been at it all day).  If it completes before I go to bed, I'll bring
its services back online so that it resumes collecting data and becomes
available again.

The labmon1001 system has been reimaged, and the 893GB of data is being
restored to the new SSDs on the system.  During the restoration process,
puppet and all services are halted on labmon1001, so it won't start
attempting to write new data during the restoration.

Apologies for the extended downtime!



On Wed, Jun 22, 2016 at 8:26 AM, Rob Halsell <rhalsell at wikimedia.org> wrote:

> Labs users,
>
> As many of you may recall, (mostly) Yuvi and (slightly) myself worked on
> labmon1001 a couple of weeks back.  Unfortunately, the work performed
> wasn't enough (as it still left the host with spinning disks) and the
> graphite service hammering them bogs down the server.  As a solution, we
> will be reinstalling the system on 2016-06-23 from 15:00 GMT to 23:00 GMT.
> I don't expect to use the majority of this window, since much of the data
> was backed up previously.  However, it is a possibility if we have to
> re-sync all the data (without differential).
>
> Details of this work can be viewed on
> https://phabricator.wikimedia.org/T137924.
>
> Once the SSDs are installed, their ability to handle the iops generated
> from graphite is expected to bring load levels on labmon1001 down to sane
> levels.
>
> Please let me know if there are any questions or concerns.  They can be
> raised via this email thread, or by comment on the linked phabricator task.
>
> Thanks,
>
> --
> Rob Halsell
> Operations Engineer
> Wikimedia Foundation, Inc.
> E-Mail: rhalsell at wikimedia.org
> Key fingerprint = CB1F C7E7 0FF8 5DB2 6820  9C7E 75ED 14C7 0245 D22A
> Office: 415.839.6885 x6620
> Fax: 415.882.0495
>
>


-- 
Rob Halsell
Operations Engineer
Wikimedia Foundation, Inc.
E-Mail: rhalsell at wikimedia.org
Key fingerprint = CB1F C7E7 0FF8 5DB2 6820  9C7E 75ED 14C7 0245 D22A
Office: 415.839.6885 x6620
Fax: 415.882.0495