We already use PoolCounter for thumbnails at the per-file level (e.g. no
more than 2 processes at a time rendering thumbnails derived from any one
original file). Since PoolCounter calls cannot be nested, we can't add
another layer of pool counting based on file type grouping or anything of
the sort.
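To make that setup concrete, the per-file pooling can be modeled roughly as
below. This is only a sketch: the real mechanism is PoolCounter in
MediaWiki, not Python, the function names are made up, and only the
2-workers-per-file limit comes from the description above.

```python
import threading
from collections import defaultdict

# Rough model of the per-file pooling: at most 2 concurrent renders per
# original file; extra waiters block until a slot frees or they time out.
PER_FILE_LIMIT = 2

_locks_guard = threading.Lock()
_pools = defaultdict(lambda: threading.Semaphore(PER_FILE_LIMIT))

def render_with_pool(file_name, render, timeout=5):
    """Run render(file_name) while holding one of the file's pool slots."""
    with _locks_guard:                    # defaultdict isn't thread-safe
        sem = _pools[file_name]
    if not sem.acquire(timeout=timeout):  # waiters eventually give up
        raise TimeoutError(f"pool full for {file_name}")
    try:
        return render(file_name)
    finally:
        sem.release()

print(render_with_pool("f.png", lambda n: n.upper()))
```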
The biggest holes right now are:
a) A bunch of new files come in quickly, say 100. There could be 200
workers rendering those files (2 per file, given the PoolCounter limit).
Many more, 50 * 100, could also be waiting on PoolCounter until they time
out, tying up thumb.php even more (though at least not using CPU or
bandwidth). The throttling config change could help with this if low
limits are picked.
b) Files come in more slowly but nobody views them until there are, say,
100, and then they all get viewed at once for some reason. I'm not sure how
likely this is, but it's not impossible. Handling it would probably require
pre-rendering some thumbnails via jobs, in addition to the throttling
config change.
c) In any case, someone could still view a bunch of non-standard sizes and
could tie up dozens of processes for a while before getting rate limited
for a short time (and they could repeat the process). The number of threads
this could tie up is lower than (b) since rate limiting would apply before
the pool queue sizes would get as large. Still, it would use a lot of
bandwidth and CPU. If the throttling were weighted (instead of counting
1 for all files), it could help.
I'm not worried about the ~7000 jobs in the queue, though; that just seems
to create a backlog that doesn't take up much space.
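The weighted throttling mentioned in (c) could take the shape of a token
bucket that charges a per-request weight instead of a flat 1. This is a
hypothetical sketch: the class name, rates, and weights are all invented,
and the weight would presumably be derived from expected render cost.

```python
import time

class WeightedThrottle:
    """Token bucket that charges a per-request weight instead of a flat 1.

    Hypothetical: 'weight' could scale with expected render cost (e.g.
    source megapixels), so one huge TIFF counts as much as many small JPEGs.
    """

    def __init__(self, rate, burst):
        self.rate = rate          # tokens replenished per second
        self.burst = burst        # maximum bucket size
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, weight=1.0):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= weight:
            self.tokens -= weight
            return True
        return False

throttle = WeightedThrottle(rate=5.0, burst=20.0)
print(throttle.allow(weight=1.0))   # cheap thumbnail
print(throttle.allow(weight=19.0))  # expensive render drains the bucket
print(throttle.allow(weight=19.0))  # rejected until tokens refill
```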
On Tue, May 13, 2014 at 2:50 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
On Tue, May 13, 2014 at 2:45 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
For the first problem, we can make an educated guess about the level of
throttling required: if we want to keep the number of simultaneous
GWToolset-related scaling requests below X, that means Special:NewFiles and
Special:ListFiles should not have more than X/2 GWToolset files on them at
any given time. Those pages show the last 50 files, so GWToolset should not
upload more than X files in the time that takes normal users to upload 100
of them. I counted the number of uploads per hour on Commons on a weekday,
and there were 240 uploads in the slowest hour, which works out to about
25 minutes per 100 files. So GWToolset should be limited to X files per
25 minutes, for some value of X that ops are happy with.
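The arithmetic above can be checked with a quick back-of-the-envelope
script (numbers from the thread; the X values are placeholders, since X is
whatever ops pick):

```python
# Back-of-the-envelope version of the estimate in the thread.
normal_uploads_per_hour = 240  # slowest hour observed on Commons
window_files = 100             # 2 x the 50 files the special pages show

minutes_per_100 = window_files * 60 / normal_uploads_per_hour
print(minutes_per_100)         # -> 25.0 minutes

# For a chosen ceiling X on simultaneous GWToolset scaling requests,
# the upload limit would be X files per 25-minute window.
for X in (10, 20, 50):         # placeholder values for X
    print(f"X={X}: at most {X} GWToolset files per {minutes_per_100:.0f} min")
```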
This is the best we can do with the current throttling options of the job
queue, I think, but it has a lot of holes. The rate of normal uploads could
drop extremely low for a short time for some reason. New file patrollers
could be looking at the special pages with non-default settings (500 images
instead of 50). Someone could look at the associated category (200
thumbnails at a time). This is not a problem if people are continuously
keeping watch on Special:NewFiles, because that would mean that the
thumbnails get rendered soon after the uploads; but that's an untested
assumption.
Maybe we could create scaling priority groups? Tag GWToolset-uploaded
images as belonging to the "expensive" group, then use PoolCounter to
ensure that no more than X expensive thumbnails are scaled at the same
time. That would throttle thumbnail rendering directly, instead of
throttling the upload speed and making guesses about how that translates
into a throttle on rendering.
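The "expensive group" idea might look roughly like the following sketch. To
be clear, this is not the PoolCounter API; in MediaWiki it would be
PoolCounter configuration rather than application code. It just models the
proposed shared cap with a plain semaphore, and the cap of 4 and all names
are invented.

```python
import threading

# Hypothetical cap on simultaneous "expensive" (GWToolset-tagged) scalings.
MAX_EXPENSIVE = 4

_expensive_slots = threading.BoundedSemaphore(MAX_EXPENSIVE)

def scale_thumbnail(file_name, is_expensive, render):
    """Render a thumbnail, gating tagged files behind one shared cap."""
    if not is_expensive:
        return render(file_name)  # normal files: per-file limit only
    # Bounded wait so stuck requests don't tie up thumb.php indefinitely.
    if not _expensive_slots.acquire(timeout=5):
        raise RuntimeError("expensive scaler pool is full, try again later")
    try:
        return render(file_name)
    finally:
        _expensive_slots.release()

print(scale_thumbnail("a.tif", True, lambda n: n + ".thumb"))
```

The point of gating only the tagged group is that mass-uploaded batches
compete with each other for the X slots, while ordinary uploads keep their
existing per-file limits untouched.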
_______________________________________________
Ops mailing list
Ops(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/ops
--
-Aaron S