On Wed, Sep 2, 2015 at 1:21 AM, Gabriel Wicke gwicke@wikimedia.org wrote:
On Tue, Sep 1, 2015 at 5:54 PM, Gergo Tisza gtisza@wikimedia.org wrote:
Rate limiting / UA policy enforcement has to be done in Varnish, since API responses can be cached there and so the requests don't necessarily reach higher layers (and we wouldn't want to vary on user agent).
The cost / benefit trade-offs for Varnish cache hits are fairly different from those of cache misses. Especially for in-memory (frontend) hits, it might be cheaper overall to just send a regular response than to add rate-limiting overhead to each cache hit.
Yeah, I was mostly thinking of uncacheable API accesses. If we can cache it, we don't mind (as much) in terms of load/abuse. Having the simpler outer check in Varnish, though, means the big load from anonymous spikes on those uncacheable requests gets absorbed there instead of at the app layer.
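
To make the idea concrete, here is a rough sketch (in Python rather than VCL, purely for illustration) of an outer check that skips rate-limit bookkeeping for cacheable requests and applies a per-client token bucket only to uncacheable ones. The names `TokenBucket` and `EdgeLimiter`, the string results, and the rate/burst numbers are all made up for this example; none of this is actual Varnish configuration or anything we run.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/sec, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new client gets its burst
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last check, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class EdgeLimiter:
    """Hypothetical outer check: only uncacheable requests are rate limited."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def handle(self, client_key: str, cacheable: bool) -> str:
        if cacheable:
            # Cacheable: serve from (or fill) the cache with no bookkeeping.
            return "serve"
        now = self.clock()
        bucket = self.buckets.get(client_key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_key] = bucket
        # Uncacheable: forward to the app layer only while tokens remain.
        return "forward" if bucket.allow(now) else "throttle"


if __name__ == "__main__":
    limiter = EdgeLimiter(rate=1.0, capacity=2.0)  # illustrative numbers only
    for _ in range(3):
        print(limiter.handle("203.0.113.7", cacheable=False))
```

What `client_key` should be (IP, user agent, some combination) is left open here. The point is only that the cacheable path never touches the limiter state, so cache hits stay as cheap as they are today.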