Cool, thanks for digging into all that for me.

One of the things that's on my list is to limit how far this digs if you give it a crazy long list of users.  I just recently discovered the totally neat more_itertools.time_limited() which I'm sure will come in handy here.
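
A rough sketch of what that time budget might look like. To keep it dependency-free, this uses a small stdlib stand-in rather than the real thing; more_itertools.time_limited(limit_seconds, iterable) takes its arguments in the same order and also exposes a timed_out attribute, which this stand-in omits. The slow_lookup() and build_timeline() names are hypothetical, not anything in the tool.

```python
import time


def time_limited_sketch(limit_seconds, iterable):
    """Yield items from iterable until limit_seconds have elapsed.

    Stand-in for more_itertools.time_limited(); not the library's code.
    """
    start = time.monotonic()
    for item in iterable:
        if time.monotonic() - start > limit_seconds:
            return  # budget spent: stop early, dropping the remaining items
        yield item


def slow_lookup(user):
    # Hypothetical placeholder for the per-user work the tool does.
    time.sleep(0.05)
    return user.upper()


def build_timeline(users, budget_seconds):
    # Process as many users as fit in the time budget, then give up.
    return [slow_lookup(u) for u in time_limited_sketch(budget_seconds, users)]
```

The limit is only checked between items, so a single slow lookup can still overrun the budget; it bounds how many users get processed, not the duration of any one call.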


On Dec 16, 2020, at 10:50 PM, Bryan Davis <bd808@wikimedia.org> wrote:

On Wed, Dec 16, 2020 at 6:04 PM Roy Smith <roy@panix.com> wrote:

The following (absurdly long) URL:

https://spi-tools-dev.toolforge.org/spi/timeline/MariaJaydHicky?users=Love2shop2020&users=DHA1398&users=Liberationthetruth&users=143.244.39.180&users=Tanittaking&users=Blumoone&users=Rashawnna&users=Samanda1013&users=Aisleyene&users=Samanda13&users=Inonuchics&users=LadyiTy&users=Hennacey&users=AnnushaG&users=105.6590c.D804.e13&users=Dylann19100&users=Cashmeoutsidecowboutthat&users=Priysha+%28mobile%29&users=Theemancipationofcaution&users=Nataliesha&users=Penn%C3%A9Proud&users=Morettay&users=Barnebykerry&users=Fallininlove&users=Theartoflettinggo&users=TherealRoxanna&users=LeToyaz&users=ChelseaEdit&users=Voodoopink1&users=Javine2020&users=Grimesno1fan2020&users=Latifahha&users=Fleeyonc%C3%A9&users=Kellymoat&users=MariaJaydHicky&users=Agirlcanmack&users=Priysha&users=UKurbanfan&users=Thebutterflyroom&users=BEPtoinfinity+&users=GLB113&users=Dan561313&users=Camilacabelloharmonizer&users=Greginorange&users=Mallberry&users=Takszoned&users=95.181.233.138&users=J2Oorangedrinker&users=IfthisismyusernamethenIlltakeit&users=BEPtoinfinity&users=TulisaTFB&users=Twyceasnyce&users=Sparkling_Kayla&users=Devilsadvencent&users=Annemariefan&users=Blackgyalbeauty&users=Lanewinds&users=JenniferJennyfromtheblockLopezfan&users=PrettyBrittany2k20&users=962.087.15900A.2341e&users=Likqwidfunk&users=Jayneties&users=Tulisalittlemuffin&users=Tulisa+Olive&users=RH613


gives me:

<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>openresty/1.15.8.1</center>
</body>
</html>
<!-- a padding to disable MSIE and Chrome friendly error page -->


Which isn't terribly surprising.  As far as I can tell, the request never gets as far as my app.  Is there a specific hard limit on how long a URL can be handled by the WSGI / routing layers?

There are 2 layers of nginx reverse proxies between the internet and
your tool's webservice. Both are using nginx defaults for
<http://nginx.org/r/large_client_header_buffers> which limit a single
request line (the url or any header passed as part of the request) to
8k bytes. When this limit is exceeded, nginx will return an HTTP 414
"Request-URI Too Large" or possibly an HTTP 400 "Bad Request" status
code.
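
To check a given request against that limit, something like the following sketch could help. It assumes the 8k default applies to the whole request line (method, URL, and protocol version) and that the query string is built the way a browser form would build it; request_line() and fits_in_nginx_buffer() are hypothetical helper names.

```python
from urllib.parse import urlencode

# Size of a single buffer under the nginx default described above.
NGINX_BUFFER_BYTES = 8 * 1024


def request_line(path, users, method="GET", version="HTTP/1.1"):
    """Build the HTTP request line for a timeline URL with repeated users= params."""
    query = urlencode([("users", u) for u in users])
    return f"{method} {path}?{query} {version}"


def fits_in_nginx_buffer(path, users):
    """Return True if the request line is no longer than one 8k buffer."""
    line = request_line(path, users)
    return len(line.encode("ascii")) <= NGINX_BUFFER_BYTES
```

Note that percent-encoding makes non-ASCII usernames several times longer than they look, so the byte count is what matters, not the number of users.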

If I try to visit your example URL without first being OAuth
authenticated to your tool I do not see anything in your tool's
$HOME/uwsgi.log. But... if I first visit the tool using a shorter URL
and log in, I do see an error on a second page load with the very long
URL:

[WARNING] unable to add
HTTP_X_ORIGINAL_URI=/spi/timeline/MariaJaydHicky?users=Love2shop2020&users=DHA1398&users=Liberationthetruth&users=143.244.39.180&users=Tanittaking&users=Blumoone&users=Rashawnna&users=Samanda1013&users=Aisleyene&users=Samanda13&users=Inonuchics&users=LadyiTy&users=Hennacey&users=AnnushaG&users=105.6590c.D804.e13&users=Dylann19100&users=Cashmeoutsidecowboutthat&users=Priysha+%28mobile%29&users=Theemancipationofcaution&users=Nataliesha&users=Penn%C3%A9Proud&users=Morettay&users=Barnebykerry&users=Fallininlove&users=Theartoflettinggo&users=TherealRoxanna&users=LeToyaz&users=ChelseaEdit&users=Voodoopink1&users=Javine2020&users=Grimesno1fan2020&users=Latifahha&users=Fleeyonc%C3%A9&users=Kellymoat&users=MariaJaydHicky&users=Agirlcanmack&users=Priysha&users=UKurbanfan&users=Thebutterflyroom&users=BEPtoinfinity+&users=GLB113&users=Dan561313&users=Camilacabelloharmonizer&users=Greginorange&users=Mallberry&users=Takszoned&users=95.181.233.138&users=J2Oorangedrinker&users=IfthisismyusernamethenIlltakeit&users=BEPtoinfinity&users=TulisaTFB&users=Twyceasnyce&users=Sparkling_Kayla&users=Devilsadvencent&users=Annemariefan&users=Blackgyalbeauty&users=Lanewinds&users=JenniferJennyfromtheblockLopezfan&users=PrettyBrittany2k20&users=962.087.15900A.2341e&users=Likqwidfunk&users=Jayneties&users=Tulisalittlemuffin&users=Tulisa+Olive&users=RH613
to uwsgi packet, consider increasing buffer size

A web search for "uwsgi buffer size" leads to
<https://uwsgi-docs.readthedocs.io/en/latest/ThingsToKnow.html> where
it says:

"By default uWSGI allocates a very small buffer (4096 bytes) for the
headers of each request. If you start receiving “invalid request block
size” in your logs, it could mean you need a bigger buffer. Increase
it (up to 65535) with the buffer-size option."

I added a line saying `buffer-size = 65535` to your tool's
$HOME/www/python/uwsgi.ini config file and then ran `webservice
restart`. Loading the very long URL following these actions seems to
work, although it takes a very, very long time for the tool to return
its response. The uwsgi log shows "generated 12347121 bytes in 111280
msecs".
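
For a rough feel of how a long URL can use up the default buffer, here is a sketch that estimates the size of the request variables handed to uWSGI. It rests on my understanding of the uwsgi protocol (each variable is sent as a 2-byte key length, the key, a 2-byte value length, and the value) and on the assumption that the URL is passed in more than one variable. HTTP_X_ORIGINAL_URI comes from the log above; the other variable names and the 1500-byte query are illustrative guesses, and real requests carry many more headers (cookies included).

```python
UWSGI_DEFAULT_BUFFER = 4096  # default buffer size quoted from the uWSGI docs
UWSGI_MAX_BUFFER = 65535     # maximum allowed by the buffer-size option


def uwsgi_vars_size(environ):
    """Estimate the bytes needed to pack request variables into a uwsgi packet."""
    total = 0
    for key, value in environ.items():
        # 2-byte length prefix + key, then 2-byte length prefix + value.
        total += 2 + len(key.encode("latin-1")) + 2 + len(value.encode("latin-1"))
    return total


def fits(environ, buffer_size=UWSGI_DEFAULT_BUFFER):
    """Return True if the estimated variable block fits in buffer_size bytes."""
    return uwsgi_vars_size(environ) <= buffer_size


# Illustrative only: a 1500-byte query string repeated across three variables.
query = "x" * 1500
example = {
    "REQUEST_URI": "/p?" + query,
    "QUERY_STRING": query,
    "HTTP_X_ORIGINAL_URI": "/p?" + query,
}
```

With these made-up numbers, fits(example) is False at 4096 bytes and fits(example, UWSGI_MAX_BUFFER) is True, which is at least consistent with the larger buffer making the error go away.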

With the larger uwsgi buffer setting, the original long URL with an
unauthenticated session works as well. I do not have an explanation
for why the buffer overflow message was not logged for your tests or
my first test without authenticating.

Bryan
-- 
Bryan Davis              Technical Engagement      Wikimedia Foundation
Principal Software Engineer                               Boise, ID USA
[[m:User:BDavis_(WMF)]]                                      irc: bd808

_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud