It's now been made configurable and the sampling rate has been lowered. While we're back to a similar event rate to last time, we're actually measuring close to 3 times the activity we were back then, thanks to the sampling.

I'll see on which wikis/metrics we can lower the rates further while keeping the data meaningful, and I'll have a config change ready for tomorrow's SWAT window. I hadn't done it yet because I wanted to study the data first, to make it less of a wild guess this time.

For today's launch to enwiki/dewiki I have already set the rates to be the same as the ones currently applied to commons.


On Tue, Jun 3, 2014 at 2:17 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
Gerco:

On May 16th we lowered the sampling rate of Media Viewer events, as the event rate was ~170 events per second. It looks like as of a week and a half ago we are at that rate again.

Please see:

https://ganglia.wikimedia.org/latest/graph.php?r=month&z=xlarge&c=Miscellaneous+eqiad&h=vanadium.eqiad.wmnet&jr=&js=&v=645622124&m=eventlogging_all-events&vl=events&ti=all-events


This means that Media Viewer is generating about 15 million rows a day in the EL database, a data flow that seems quite high for our capacity to analyze it.
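
As a rough sanity check of that figure: a sustained rate of ~170 events per second works out to a little under 15 million events per day. The sketch below assumes one database row per event and a constant rate throughout the day (both are assumptions, and the function name is illustrative only).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def rows_per_day(events_per_second: float) -> float:
    """Estimate daily rows, assuming one row per event and a constant rate."""
    return events_per_second * SECONDS_PER_DAY

# ~170 events/s is the rate quoted above
print(f"{rows_per_day(170):,.0f}")  # 14,688,000
```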

Is this a mistake? Should sampling rates be lowered again? 

So you know, right now Media Viewer is sampling more than twice as much as the rest of the teams at the foundation combined. If every team sampled at this rate, the system would go down. At this time, event logging is not at risk of going down, but the replication is affected.

Thanks,

Nuria