[Labs-l] Debugging web service failure

Morten Wang nettrom at gmail.com
Tue Apr 19 03:22:50 UTC 2016


I dug a little further by ssh'ing to the exec host where lighttpd is
running and sending HTTP requests directly to the port defined in
/var/run/lighttpd/suggestbot.conf. lighttpd is running and happy to
serve up the pages locally. (I don't actually have to ssh to the exec host;
I can just as easily send the requests from tools-dev.) To me this suggests
the problem is somewhere in the proxy.
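
The check described above can be sketched in Python. This is only a rough
illustration, not the exact commands that were run: it assumes the generated
config sets the port with the usual lighttpd "server.port = N" line, and the
helper names (read_port, http_status, diagnose) are made up for this example.

```python
import re
import urllib.error
import urllib.request


def read_port(conf_text):
    """Return the server.port value from lighttpd config text, or None.

    Assumes the standard lighttpd form: server.port = 37083
    """
    match = re.search(r"^\s*server\.port\s*=\s*(\d+)", conf_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx reply still proves that something is listening.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def diagnose(backend_status, proxy_status):
    """Compare the direct answer with the answer seen through the proxy."""
    if backend_status is None:
        return "backend unreachable"
    if proxy_status == 503:
        # lighttpd answers directly, yet the proxy reports 503.
        return "proxy"
    return "no proxy fault seen"


# Example use on the exec host (host name and URL are placeholders):
#   with open("/var/run/lighttpd/suggestbot.conf") as fh:
#       port = read_port(fh.read())
#   direct = http_status("http://localhost:%d/suggestbot/hw.html" % port)
#   proxied = http_status("http://tools.wmflabs.org/suggestbot/hw.html")
#   print(diagnose(direct, proxied))
```

If the direct request gets an answer while the proxied one returns 503, that
points at the proxy rather than at lighttpd, which matches what I am seeing.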


Cheers,
Morten


On 18 April 2016 at 14:27, Morten Wang <nettrom at gmail.com> wrote:

> Hi Merlijn,
>
> Thanks for looking into that. Yes, I've noticed that restarting the web
> service seems to fix the problem, but only temporarily. If I now, about
> 25 mins after you emailed, go to
> http://tools.wmflabs.org/suggestbot/hw.html I get a 503 service
> unavailable. This has been a consistent pattern lately: I restart
> and shortly thereafter it is unavailable again. Which is also why it's
> difficult to debug what's going on.
>
>
> Cheers,
> Morten
>
>
> On 18 April 2016 at 14:00, Merlijn van Deen (valhallasw) <
> valhallasw at arctus.nl> wrote:
>
>> Hi Morten,
>>
>> The 503 suggests the web proxy hit a hitch -- the web server is running
>> on tools-webgrid-lighttpd-1415:37083 (qstat -xml, ssh to that host, ps aux
>> | grep suggestbot; vim /var/run/lighttpd/suggestbot.conf), but the proxy is
>> unaware of that.
>>
>> I have restarted your webservice job (qmod -rj 5458226), and this seems
>> to have resolved the issue (https://tools.wmflabs.org/suggestbot/ now
>> 404s, but that's your lighttpd webservice and not the proxy).
>>
>> Best,
>> Merlijn
>>
>> On 18 April 2016 at 19:00, Morten Wang <nettrom at gmail.com> wrote:
>>
>>> I'm currently having an issue with SuggestBot's (tools.suggestbot) web
>>> services failing after a while, resulting in a 503 (service unavailable).
>>> The web service process on the grid appears to be running, but it also
>>> appears to be unreachable.
>>>
>>> There is nothing in the error.log nor the access.log that suggests
>>> anything happened to lighttpd, and I am unsure about what debug settings to
>>> add to make it possible to log any errors. The fastcgi.debug setting
>>> doesn't appear to reveal anything. In other words, I've reached the end of
>>> my abilities to figure this out, and thus I wonder:
>>>
>>> What are some best practices when it comes to debugging what is going on?
>>>
>>>
>>> Cheers,
>>> Morten
>>>
>>>
>>> _______________________________________________
>>> Labs-l mailing list
>>> Labs-l at lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>>
>>>
>>
>