On Tue, May 13, 2014 at 2:45 PM, Gergo Tisza gtisza@wikimedia.org wrote:
For the first problem, we can make an educated guess of the level of throttling required: if we want to keep the number of simultaneous GWToolset-related scaling requests below X, that means Special:NewFiles and Special:ListFiles should not have more than X/2 GWToolset files on them at any given time. Those pages show the last 50 files, so GWToolset should not upload more than X files in the time it takes normal users to upload 100 of them. I counted the number of uploads per hour on Commons on a weekday, and there were 240 uploads in the slowest hour, which works out to about 25 minutes for 100 files. So GWToolset should be limited to X files in 25 minutes, for some value of X that ops are happy with.
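To make the arithmetic above easy to redo with other numbers, here is a minimal Python sketch. The function names are made up for illustration, and X = 10 is a placeholder only; the actual value of X is for ops to decide.

```python
def throttle_window_minutes(uploads_per_hour, page_size=50, pages=2):
    """Minutes it takes normal users to upload page_size * pages files.

    page_size=50 and pages=2 mirror the two special pages (Special:NewFiles
    and Special:ListFiles), each showing the last 50 files.
    """
    files = page_size * pages
    return files * 60 / uploads_per_hour


def gwtoolset_files_per_hour(x, uploads_per_hour, page_size=50, pages=2):
    """Hourly GWToolset upload cap implied by 'X files per window'."""
    window = throttle_window_minutes(uploads_per_hour, page_size, pages)
    return x * 60 / window


# 240 uploads in the slowest hour, as counted above.
window = throttle_window_minutes(240)

# X = 10 is a hypothetical value, not a recommendation.
hourly_cap = gwtoolset_files_per_hour(10, 240)
```

With 240 uploads per hour the window comes out to 25 minutes, and a placeholder X of 10 would correspond to 24 GWToolset files per hour.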
This is the best we can do with the current throttling options of the job queue, I think, but it has a lot of holes. The rate of normal uploads could drop extremely low for a short time for some reason. New file patrollers could be looking at the special pages with non-default settings (500 images instead of 50). Someone could look at the associated category (200 thumbnails at a time). This is not a problem if people are continuously keeping watch on Special:NewFiles, because that would mean that the thumbnails get rendered soon after the uploads; but that's an untested assumption.
Maybe we could create scaling priority groups? Tag GWToolset-uploaded images as belonging to the "expensive" group, then use PoolCounter to ensure that no more than X expensive thumbnails are scaled at the same time. That would throttle thumbnail rendering directly, instead of throttling the upload speed and making guesses about how that translates into a throttle on rendering.
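As a rough illustration of the idea, here is a Python sketch that caps the number of concurrent "expensive" scaling jobs with a semaphore. This is not PoolCounter's API and does not model it; PoolCounter is a separate service with its own configuration. The class, the `fake_scale` function, and the limit of 3 are all hypothetical stand-ins.

```python
import threading
import time


class ExpensiveScalingPool:
    """Allows at most max_concurrent expensive scaling jobs at once."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.active = 0  # jobs currently running
        self.peak = 0    # highest concurrency observed

    def render(self, scale_fn, *args):
        # Block until one of the slots is free, then run the job.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return scale_fn(*args)
            finally:
                with self._lock:
                    self.active -= 1


def fake_scale(name):
    """Stand-in for a slow thumbnail scaling call."""
    time.sleep(0.01)
    return name + ".thumb"


def run_demo(max_concurrent=3, jobs=12):
    pool = ExpensiveScalingPool(max_concurrent)
    results = []
    threads = [
        threading.Thread(
            target=lambda n=n: results.append(pool.render(fake_scale, f"file{n}"))
        )
        for n in range(jobs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.peak, sorted(results)


peak, thumbs = run_demo()
```

All twelve requests arrive at once here, but no more than three are ever scaled in parallel; the rest simply wait for a slot, regardless of how quickly the uploads came in.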