[Labs-l] Debugging web service failure

Morten Wang nettrom at gmail.com
Tue Apr 19 19:24:20 UTC 2016


I've opened https://phabricator.wikimedia.org/T133090 to track this issue.
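
For anyone who wants to repeat the check described in my previous message (quoted below), here is a rough Python sketch of it: request the same page once from lighttpd directly and once through the proxy, then compare the status codes. The helper names fetch_status and diagnose are made up for illustration, and the diagnosis strings are only a guess at how to read the result, not anything the Tool Labs tooling provides.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return None


def diagnose(backend_status, proxy_status):
    """Guess which layer is failing from the two status codes (heuristic)."""
    if backend_status is None:
        return "backend unreachable"
    if proxy_status == 503 and backend_status != 503:
        return "backend answers, proxy returns 503: suspect the proxy"
    if proxy_status == backend_status:
        return "proxy and backend agree"
    return "inconclusive"
```

From tools-dev this could be called as diagnose(fetch_status(backend_url), fetch_status("http://tools.wmflabs.org/suggestbot/hw.html")), where backend_url is built from the exec host and the port in /var/run/lighttpd/suggestbot.conf. Both of those change whenever the job is restarted, and I am assuming the backend serves the page under the same /suggestbot/ path as the proxy.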


Cheers,
Morten

On 18 April 2016 at 20:22, Morten Wang <nettrom at gmail.com> wrote:

> I dug a little further by ssh'ing to the exec host where lighttpd is
> running and sending HTTP requests directly to the port defined in
> /var/run/lighttpd/suggestbot.conf. lighttpd is running and happy to
> serve up the pages locally (I don't actually have to ssh to the exec host;
> I can just as easily send the requests from tools-dev). To me this suggests
> the problem is somewhere in the proxy.
>
>
> Cheers,
> Morten
>
>
> On 18 April 2016 at 14:27, Morten Wang <nettrom at gmail.com> wrote:
>
>> Hi Merlijn,
>>
>> Thanks for looking into that. Yes, I've noticed that restarting the web
>> service seems to fix the problem, but only temporarily. If I now, about
>> 25 mins after you emailed, go to
>> http://tools.wmflabs.org/suggestbot/hw.html I get a 503 service
>> unavailable. This has been a consistent pattern lately: I restart,
>> and shortly thereafter it is unavailable again. That is also why it's
>> difficult to debug what's going on.
>>
>>
>> Cheers,
>> Morten
>>
>>
>> On 18 April 2016 at 14:00, Merlijn van Deen (valhallasw) <
>> valhallasw at arctus.nl> wrote:
>>
>>> Hi Morten,
>>>
>>> The 503 suggests the web proxy hit a hitch -- the web server is running
>>> on tools-webgrid-lighttpd-1415:37083 (qstat -xml, ssh to that host, ps aux
>>> | grep suggestbot; vim /var/run/lighttpd/suggestbot.conf), but the proxy is
>>> unaware of that.
>>>
>>> I have restarted your webservice job (qmod -rj 5458226), and this seems
>>> to have resolved the issue (https://tools.wmflabs.org/suggestbot/ now
>>> 404s, but that's your lighttpd webservice and not the proxy).
>>>
>>> Best,
>>> Merlijn
>>>
>>> On 18 April 2016 at 19:00, Morten Wang <nettrom at gmail.com> wrote:
>>>
>>>> I'm currently having an issue with SuggestBot's (tools.suggestbot) web
>>>> services failing after a while, resulting in a 503 (service unavailable).
>>>> The web service process on the grid appears to be running, but it also
>>>> appears to be unreachable.
>>>>
>>>> There is nothing in error.log or access.log that suggests
>>>> anything happened to lighttpd, and I am unsure which debug settings to
>>>> add so that any errors get logged. The fastcgi.debug setting
>>>> doesn't appear to reveal anything. In other words, I've reached the end of
>>>> my abilities to figure this out, and thus I wonder:
>>>>
>>>> What are some best practices when it comes to debugging what is going
>>>> on?
>>>>
>>>>
>>>> Cheers,
>>>> Morten
>>>>
>>>>
>>>> _______________________________________________
>>>> Labs-l mailing list
>>>> Labs-l at lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>>>
>>>>