the replication is affected.
I take this back; as Christian mentioned, there are other issues that might be affecting replication.
> I'll see on what wikis/metrics we can lower the rates further while keeping meaningful data and I'll have a config change ready for tomorrow's swat window. I hadn't done it yet because I wanted to study the data to make less of a wild guess this time.

Sounds good, I think logging a lot to start with, studying the data and later lowering rates is what makes most sense. Thanks for the prompt response.
On Tue, Jun 3, 2014 at 6:19 PM, Gilles Dubuc gilles@wikimedia.org wrote:
Related changesets, for reference:
https://gerrit.wikimedia.org/r/#/c/134064/
https://gerrit.wikimedia.org/r/#/c/134343/
https://gerrit.wikimedia.org/r/#/c/134804/
https://gerrit.wikimedia.org/r/#/c/134837/
https://gerrit.wikimedia.org/r/#/c/134065/
https://gerrit.wikimedia.org/r/#/c/136717/
On Tue, Jun 3, 2014 at 6:15 PM, Gilles Dubuc gilles@wikimedia.org wrote:
It's now been made configurable and the sampling rate has been lowered. While we're back to a similar rate as last time, we're actually measuring close to 3 times the activity we were back then, thanks to the sampling.
I'll see on what wikis/metrics we can lower the rates further while keeping meaningful data and I'll have a config change ready for tomorrow's swat window. I hadn't done it yet because I wanted to study the data to make less of a wild guess this time.
For today's launch to enwiki/dewiki I have already set the rates to be the same as the one currently applied to commons.
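For readers unfamiliar with the mechanism, here is a minimal sketch of per-wiki sampled logging, assuming a simple "1 in N" scheme. The names `SAMPLING_FACTORS`, `should_log` and `estimate_actual` and the factor values are hypothetical and chosen only for illustration; they are not taken from the Media Viewer configuration in the changesets above.

```python
import random

# Hypothetical per-wiki sampling factors: log roughly 1 event in N.
# The numbers are made up for illustration only.
SAMPLING_FACTORS = {
    "commonswiki": 1000,
    "enwiki": 1000,  # same as commons, mirroring the launch setting described above
    "dewiki": 1000,
}


def should_log(sampling_factor: int, rng: random.Random) -> bool:
    """Return True for roughly 1 in `sampling_factor` events."""
    return rng.randrange(sampling_factor) == 0


def estimate_actual(logged_events: int, sampling_factor: int) -> int:
    """Scale a sampled count back up to an estimate of real activity."""
    return logged_events * sampling_factor


if __name__ == "__main__":
    rng = random.Random(0)
    factor = SAMPLING_FACTORS["enwiki"]
    logged = sum(should_log(factor, rng) for _ in range(1_000_000))
    print(f"logged {logged} events, estimated actual: {estimate_actual(logged, factor)}")
```

Under this scheme a similar logged rate can stand for more real activity than before, because each logged event represents N actual ones.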
On Tue, Jun 3, 2014 at 2:17 PM, Nuria Ruiz nuria@wikimedia.org wrote:
Gerco:
On May 16th we lowered the sampling rate of Media Viewer events, as the event rate was ~170 events per second. It looks like as of a week and a half ago we are again at that rate.
Please see:
https://ganglia.wikimedia.org/latest/graph.php?r=month&z=xlarge&c=Mi...
This means that Media Viewer is generating about 15 million rows a day in the EL database, a data flow that seems quite high for our capacity to analyze it.
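The daily figure follows from the ~170 events per second mentioned above. A quick check, assuming each event is stored as exactly one row (the helper name `rows_per_day` is illustrative only):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_day(events_per_second: float) -> float:
    """Convert a sustained event rate into rows per day, assuming one row per event."""
    return events_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{rows_per_day(170):,.0f} rows/day")  # prints "14,688,000 rows/day"
```

That is roughly 14.7 million rows, which rounds to the 15 million quoted.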
Is this a mistake? Should sampling rates be lowered again?
So you know, right now Media Viewer is sampling more than twice as much as the rest of the teams at the foundation combined. If every team sampled at this ratio, the system would go down. At this time, event logging is not at risk of going down, but the replication is affected.
Thanks,
Nuria