On Mon, Apr 21, 2014 at 12:22:36PM +0200, Gilles Dubuc wrote:
Can you clarify something: for a given set of heavyweight thumbnails that need to be rendered, assuming the uploads have ceased, would multiple visits to Special:NewFiles in a short timeframe multiply the saturation by the number of HTTP requests to the same thumbnail URLs? I.e., if you request the URL of a thumbnail that is currently being generated because someone else requested it, does that make the issue worse?
Varnish should coalesce requests for the exact same URL, and Aaron has recently been working on wrapping thumb calls in PoolCounter on the MediaWiki side as well. So, in theory, no, this shouldn't happen.
Note that Special:NewFiles will show thumbnails from multiple files, though, so your browser would request multiple *different* thumbnails (different URLs) in parallel.
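To make the coalescing idea concrete, here is a minimal sketch in Python. It is not Varnish's or PoolCounter's actual implementation; the class name `Coalescer` and the `render` callback are hypothetical names made up for illustration. The first request for a URL does the work, and any request for the same URL that arrives while that work is in flight waits for the result instead of starting a second render.

```python
import threading


class Coalescer:
    """Hypothetical illustration: collapse concurrent requests for one URL."""

    def __init__(self, render):
        self._render = render        # assumed callback: url -> thumbnail
        self._lock = threading.Lock()
        self._inflight = {}          # url -> shared state of the running render

    def get(self, url):
        with self._lock:
            entry = self._inflight.get(url)
            leader = entry is None
            if leader:
                # First request for this URL: register ourselves as the leader.
                entry = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[url] = entry
        if leader:
            try:
                entry["value"] = self._render(url)
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[url]
                entry["done"].set()  # wake up every waiting follower
        else:
            # Same URL already being rendered: wait instead of rendering again.
            entry["done"].wait()
        if entry["error"] is not None:
            raise entry["error"]
        return entry["value"]
```

Requests for different URLs do not share an entry, so they still render in parallel, which is the Special:NewFiles case described above.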
Second question is, how come piling on jobs doesn't just make the jobs that came last complete much later? The same kind of DoS situation could happen with someone bombarding us with HEAD requests on previously unrequested thumbnail sizes for small images, so I think that the issue isn't specific to large jobs. It's more a matter of properly queueing things up so that the imagescalers don't overload, regardless of the mix of job weight.
Yes, there are multiple vectors here that can DoS us, including the one you mention :) Note that these aren't "jobs" in the jobqueue sense; thumb generation happens in realtime, for every request that comes in for a size that isn't already stored (cached) in Swift.
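As a rough sketch of what "realtime" means here, again in Python and again with made-up names (`ThumbService`, `Busy`, and an in-memory dict standing in for Swift), a request for a size that is already stored is served directly, while a miss triggers a render inside the request itself. The cap on concurrent renders is only an assumption, added to show one way of keeping the scalers from overloading; it is not a description of what MediaWiki does.

```python
import threading


class Busy(Exception):
    """Raised when every render slot is taken (hypothetical behaviour)."""


class ThumbService:
    def __init__(self, render, max_concurrent_renders=2):
        self._render = render   # assumed callback: (name, width) -> thumbnail
        self._store = {}        # in-memory stand-in for the Swift thumb store
        self._slots = threading.BoundedSemaphore(max_concurrent_renders)

    def get(self, name, width):
        key = (name, width)
        thumb = self._store.get(key)
        if thumb is not None:
            return thumb        # size already stored: no scaling needed
        # Miss: the render happens now, as part of this request, not in a queue.
        if not self._slots.acquire(blocking=False):
            raise Busy("all render slots in use for %s at %dpx" % (name, width))
        try:
            thumb = self._render(name, width)
            self._store[key] = thumb
            return thumb
        finally:
            self._slots.release()
```

With this shape, a flood of requests for previously unrequested sizes fails fast once the slots are full rather than piling more work onto the scalers.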
Faidon