[Labs-l] Resolved: Labs and toollabs outage in progress
Andrew Bogott
abogott at wikimedia.org
Tue Oct 7 23:53:43 UTC 2014
On 10/7/14 6:50 PM, John wrote:
> Any details on what parts of toollabs went down? I.e., services running
> on that virt?
I can tell you which tools instances were on virt1005:
| 120cc401-ed7a-44c5-b905-2d0eae23b6af | tools-exec-03
| 30b98f1d-1c5a-49c1-b800-f4c535addc12 | tools-exec-07
| 5cd684db-d0a6-4241-a11f-daf4c1b2f717 | tools-exec-09
| 523df61c-07f0-41ba-924d-e2b8e474b4d7 | tools-exec-cyberbot
| 96c37c36-970b-4cc7-a7ba-d1ee90a225b5 | tools-submit
| cdce426b-ef6f-47e7-96e4-bcb3647f4709 | tools-webgrid-04
| 79aeb31c-a1c1-41af-9e00-df2c7e248924 | tools-webgrid-tomcat
| 8d92c507-d253-425d-b7f4-2af3678a39ae | tools-webproxy
| 22d32e6e-608c-48a8-8423-2a1ff69fad4d | toolsbeta-exec-01
| 31e8206d-fa5c-4e62-a805-8cfb7def1f46 | toolsbeta-puppetmaster3
| 4f223286-49e0-4526-8a4e-8b64c132422a | toolsbeta-webnode-01
As for which jobs died -- that's a question for someone with better grid
skills than me :)
-A
>
> On Tuesday, October 7, 2014, Andrew Bogott <abogott at wikimedia.org> wrote:
>
> On 10/7/14 5:54 PM, Andrew Bogott wrote:
>
> One of the labs servers (virt1005) has just died. Marc and I
> are investigating, but for the moment roughly 10% of labs
> instances are in a SHUTOFF state. Please do not
> restart these instances until I send an 'all clear' message to
> the list.
>
> Virt1005 is back up and seems to be OK. I'm now booting all
> instances on that box -- they should be up and running in a few
> minutes, but will show signs of an unceremonious reboot so you'll
> want to make sure your services are all still running properly.
>
> This crash may be related to overprovisioning on virt1005... we're
> in the process of purchasing new hardware to expand capacity and
> avoid such issues in the future.
>
> Thank you again for your patience!
>
> -Andrew