Hello,
Do you have any more details on which tile layers are getting hit? Is it low- or high-zoom tiles? What referers / user-agents do the requests come from? Is it the tiles that get served through mod_tile, the hillshading tiles, or the tiles for the wiki mini atlas?
Excessive load from individual clients has been an issue on many other tileservers as well. Mostly it comes from various mobile apps that let their users download large areas (e.g. Germany) for offline use. These areas can cover millions of tiles, which the clients then try to download as fast as the connection allows.
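To give a feeling for the numbers, here is a rough Python sketch that counts the tiles inside a bounding box using the standard slippy-map tile formula. The bounding box for Germany is an approximate, assumed value, and the function names are made up for illustration. The 40 tiles/s figure is simply the rate mentioned in the quoted message below.

```python
import math

def tile_xy(lat_deg, lon_deg, zoom):
    """Return the (x, y) index of the slippy-map tile containing a point."""
    n = 2 ** zoom
    lat = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so points on the outer edge stay inside the tile grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tiles_in_bbox(south, west, north, east, zoom):
    """Count the tiles that a bounding box touches at one zoom level."""
    x_min, y_max = tile_xy(south, west, zoom)
    x_max, y_min = tile_xy(north, east, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)

def total_tiles(bbox, max_zoom):
    """Sum the tile counts for zoom levels 0..max_zoom."""
    return sum(tiles_in_bbox(*bbox, z) for z in range(max_zoom + 1))

# Approximate bounding box for Germany: south, west, north, east (assumed).
GERMANY = (47.27, 5.87, 55.06, 15.04)

if __name__ == "__main__":
    for max_zoom in (12, 14, 16, 18):
        total = total_tiles(GERMANY, max_zoom)
        hours = total / 40 / 3600
        print(f"z0-{max_zoom}: {total:>12,} tiles, {hours:,.1f} h at 40 tiles/s")
```

Each extra zoom level roughly quadruples the tile count, so the deepest zoom levels dominate the total.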
For that reason, the tileservers on osm.org have a substantial list of user-agents that they block completely, and in addition they apply automatic rate limiting per IP. There is also a specific tile usage policy ( https://wiki.openstreetmap.org/wiki/Tile_usage_policy ) that governs how you are technically allowed to access the tile servers (once you have downloaded the tiles, you may use them freely under the terms of the CC-BY-SA licence).
Other tileservers, such as opencyclemap, have similar restrictions, and mod_tile (the Apache module used to deliver tiles) has a number of features available to limit traffic. mod_tile also has a complex system that tries to maximise the cacheability of tiles while still keeping them up to date. This system can furthermore be tuned towards either freshness or cacheability as needed.
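The trade-off between freshness and cacheability can be illustrated with a small Python sketch. This is not mod_tile's actual algorithm or configuration, only a common heuristic: let clients cache a tile for a fraction of the time since it last changed, clamped between a minimum and a maximum. The function names and all default values are assumptions chosen for illustration.

```python
def max_age_seconds(seconds_since_change, factor=0.2,
                    minimum=3 * 3600, maximum=7 * 24 * 3600):
    """Pick a Cache-Control max-age from how long ago a tile last changed.

    All defaults are illustrative assumptions, not mod_tile's settings.
    """
    # A tile that has been stable for a long time is unlikely to change soon,
    # so it may be cached for longer.
    age = max(0.0, seconds_since_change) * factor
    return int(min(max(age, minimum), maximum))

def cache_headers(seconds_since_change, **tuning):
    """Build the response header for a tile (hypothetical helper)."""
    return {"Cache-Control": f"max-age={max_age_seconds(seconds_since_change, **tuning)}"}

if __name__ == "__main__":
    for days in (0, 1, 10, 365):
        print(days, "days since change ->", cache_headers(days * 86400))
```

Raising `factor` or `maximum` favours cacheability and reduces repeat requests. Lowering them favours freshness at the cost of more traffic.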
My impression so far was that this has never been an issue on the toolserver, and I wasn't aware of any explicit policy on how the toolserver tiles may be accessed, so I never activated any of the limiting features. But if it is becoming an issue, we can see how best to combat it.
At least on the munin graphs for ptolemy, I don't see much of an increase in load. But if it is the hillshading tiles or the WMA tiles, those don't get served through ptolemy as far as I am aware.
Kai
On 10/03/2013 03:08 PM, Marlen Caemmerer wrote:
Hello,
In the last few days Toolserver experienced outages of web pages, which were caused by too many queries from only a few hosts. They are using OSM images and - please don't ask me why - single IPs tend to query about 40-50 pictures per second for minutes or hours; peaks can be worse. At some point our web server then gives up. Yes, sorry ;). I can proudly say that today alone about 11.7 million web queries were answered somehow.
I tried to mitigate the problem of "too many requests per IP" via blocking, but that is not an option. One problem is that users of at least one portal then complain, and another is that the IP addresses seem random, some even coming from dial-up ranges. There might be something badly wrong with the cache-control headers for the images (or perhaps we can tweak something there), or - I don't know what it could be.
To make a long story short: I rate limited the OSM tile delivery to 40 images per second per IP, with an allowed burst of 55. Users will then get a 503 error while they exceed the rate, until it decreases again, but delivery isn't stopped completely.
It seems to work, since I have some notices of which IPs were throttled, and these are IPs that have heavy usage.
I used this to throttle: http://nginx.org/en/docs/http/ngx_http_limit_req_module.html
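As an illustration of how such a limit behaves, here is a simplified leaky-bucket model in Python using the numbers above (40 requests per second, burst of 55). It is only a sketch of the general technique, not the nginx implementation, and it ignores the request delaying that the nginx module can also do. The class and method names are made up.

```python
class LeakyBucketLimiter:
    """Simplified per-IP leaky bucket (illustrative model, not nginx's code)."""

    def __init__(self, rate=40.0, burst=55.0):
        self.rate = rate      # requests drained from the bucket per second
        self.burst = burst    # how many requests the bucket can hold
        self._level = {}      # current bucket level per IP
        self._last = {}       # time of the last request seen per IP

    def allow(self, ip, now):
        """Return True to serve the request, False to answer with a 503."""
        elapsed = now - self._last.get(ip, now)
        # Drain the bucket for the time that has passed since the last request.
        level = max(0.0, self._level.get(ip, 0.0) - self.rate * elapsed)
        accepted = level + 1.0 <= self.burst
        if accepted:
            level += 1.0
        self._level[ip] = level
        self._last[ip] = now
        return accepted

if __name__ == "__main__":
    limiter = LeakyBucketLimiter()
    # A client requesting 50 tiles per second for 30 seconds.
    served = sum(limiter.allow("192.0.2.1", i / 50.0) for i in range(1500))
    print(f"served {served} of 1500 requests")
```

In this model a client that stays at or below 40 requests per second is never rejected. A client that goes faster fills the bucket over a few seconds and then sees some 503 responses, but it keeps getting about 40 tiles per second.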
I don't want to keep this option configured forever. I rather hope we can do something about caching, or give the projects the pictures they need themselves (I doubt we have to deliver hillshading pictures for everyone - this is Toolserver).
If anyone has an idea what to do, or any questions, please let me know.
Cheers Marlen/nosy
Maps-l mailing list Maps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/maps-l