Looks like my finer-grained factor tweaking was reasonably effective,
considering that we've just doubled Media Viewer traffic with the
enwiki+dewiki launch. The EL usage is still high in the grand scheme of
things, but I'd like to have it running with the new values for a bit to
see how much lower I can take it. I'm going on vacation for a week starting
tonight; when I'm back I'll reduce the EL usage further in a new pass of
studying the data.
On Tue, Jun 3, 2014 at 2:17 PM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
> Gerco:
>
> On May 16th we lowered the sampling rate of Media Viewer events because the
> event rate was ~170 events per second. It looks like as of a week and a half
> ago we are at that rate again.
>
> Please see:
>
> This means that Media Viewer is generating about 15 million rows a day in
> the EL database, a data flow that seems quite high for our capacity to
> analyze it.
>
> Is this a mistake? Should sampling rates be lowered again?
>
> So you know, right now Media Viewer is sampling more than twice as much as
> the rest of the teams at the foundation combined. If every team sampled at
> this rate the system would go down. At this time, event logging is not at
> risk of going down, but replication is affected.
>
> Thanks,
>
> Nuria
>