[Labs-l] Yet another partial labs outage

Sun May 17 18:13:57 UTC 2015

I'll note again that the instance Andrew shut down had nothing to do with
wdq.wmflabs.org. wdq went down when all the hosts were rebooted and
came back up not long afterward.
On May 17, 2015 2:11 PM, "Petr Bena" <benapetr at gmail.com> wrote:

> Note: my previous mail was not directed at Gerard, but at everyone who
> complains about this :)
>
> Also, tool-labs may be something of an exception here: high availability is
> expected there, but this tool was hosted somewhere else.
>
> On Sun, May 17, 2015 at 8:08 PM, Petr Bena <benapetr at gmail.com> wrote:
> > I agree with Ryan on this: if it's production stuff, it shouldn't run
> > on labs unless you are OK with outages. There are a number of things that
> > are more or less considered production, like wm-bot or huggle's
> > components, but none of them are critical and it's not a big deal to
> > have an occasional outage. If your service must be up 24/7, it should be on
> > production servers, and the operations team needs to be trained to
> > operate it to ensure high availability. If you fail to do that, you
> > can't blame the labs people, just yourself.
> >
> > On Sun, May 17, 2015 at 9:53 AM, Gerard Meijssen
> > <gerard.meijssen at gmail.com> wrote:
> >> Hoi,
> >> Pointing to a similar service does not recognise the FACT that
> >> production-grade services are running on Labs. They are. Stating that
> >> something similar is being worked on does NOT mean that it will indeed
> >> replace what is in FACT used in a production manner. That would require
> >> replacing the functionality itself to be a development criterion.
> >>
> >> I do salute the Labs people in that they have improved the stability of
> >> WDQ a lot. They did puppetise the services needed for running many of the
> >> tools, they made additional memory available and they collaborated with
> >> Magnus on making the services more robust.
> >>
> >> However, functionality in the pipeline is not what is being used, and
> >> theories of what production means are not what you can actually observe.
> >> They are theories and as such not reliable.
> >> Thanks,
> >>       GerardM
> >>
> >> On 17 May 2015 at 08:23, Tim Landscheidt <tim at tim-landscheidt.de> wrote:
> >>>
> >>> Ryan Lane <rlane32 at gmail.com> wrote:
> >>>
> >>> > [WDQ]
> >>>
> >>> > If it's production-ish, it should likely either be moved to
> >>> > production or you should put a bit of effort into making it work
> >>> > across multiple instances. The ideal goal is for services to be
> >>> > stateless, with their state living in databases that are also split
> >>> > across instances. It's best to have the service config managed
> >>> > (ideally puppetized since it's what wikimedia uses) so that a loss of
> >>> > an instance is only a brief inconvenience.
> >>>
> >>> There are efforts to deploy a similar service with
> >>> https://www.mediawiki.org/wiki/Wikibase/Indexing (Phabricator
> >>> project at
> >>> https://phabricator.wikimedia.org/tag/wikidata-query-service/).
> >>>
> >>> Tim
> >>>
> >>>
> >>> _______________________________________________
> >>> Labs-l mailing list
> >>> Labs-l at lists.wikimedia.org
> >>> https://lists.wikimedia.org/mailman/listinfo/labs-l