Hi Timo,
[ adding Ori to CC, since I am not sure whether or not he is on this list, and it seems he knows more about webperf ]
On Sat, Dec 20, 2014 at 05:25:15AM +0000, Timo Tijhof wrote:
> Looking at the graphs [...] reveals all data points have them plummeted straight down that day and haven't shown any activity since.
It seems the 'statsd-mw-js-deprecate' webperf service on hafnium did not properly detect (or recover from) a restart of EventLogging's zmq [1].
I am not sure who maintains the 'statsd-mw-js-deprecate' webperf service [2].
But I've been bold [3] and restarted it. The service now seems to be working again and producing the needed numbers:
https://graphite.wikimedia.org/render/?width=640&height=480&_salt=14...
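To guard against a repeat, the service could notice when its event stream goes silent and reconnect on its own. The following is only a rough sketch of that idea, not the actual code in puppet: `StalenessWatchdog`, `consume`, `connect`, `receive` and the timeout values are all hypothetical names and numbers picked for illustration, and I am assuming the consumer can poll its subscription with a timeout.

```python
import time


class StalenessWatchdog:
    """Remembers when the last event arrived and reports prolonged silence."""

    def __init__(self, max_silence_seconds, clock=time.monotonic):
        self.max_silence_seconds = max_silence_seconds
        self.clock = clock
        self.last_event_at = clock()

    def record_event(self):
        # Call this whenever an event arrives (or right after reconnecting).
        self.last_event_at = self.clock()

    def is_stale(self):
        # True once nothing has been seen for longer than the allowed silence.
        return self.clock() - self.last_event_at > self.max_silence_seconds


def consume(connect, handle, watchdog, poll_seconds=1.0, max_polls=None):
    """Read events, reconnecting whenever the stream has gone quiet.

    connect() is a hypothetical factory returning a receive(timeout) callable
    that yields one event, or None if nothing arrived within the timeout.
    max_polls only exists so the loop can be bounded in tests.
    Returns the number of reconnects performed.
    """
    receive = connect()
    reconnects = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        event = receive(poll_seconds)
        if event is not None:
            watchdog.record_event()
            handle(event)
        elif watchdog.is_stale():
            # The publisher may have restarted: open a fresh subscription.
            receive = connect()
            watchdog.record_event()  # restart the silence timer
            reconnects += 1
    return reconnects
```

In a real service, `connect` would open a fresh subscriber socket to the EventLogging stream, and the silence threshold would have to be well above the normal gap between events so that quiet periods do not trigger needless reconnects.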
Have fun, Christian
[1] https://graphite.wikimedia.org/render/?width=640&height=480&_salt=14...
[2] It's not EventLogging. The code is directly in puppet. Most webperf commits came from Ori, but this Python file seems to have been written by you.
[3] I could find neither you nor Ori on IRC, the service was not doing its job anyway, and from looking at the code, it seemed safe to restart.