Hi,
With Ori not responsible for statsv maintenance in an official capacity, should the Analytics team handle statsv maintenance going forward?
Ori has tried to leave it in a state that doesn't need much maintenance (read: restarts in case of issues) and is still working on that (https://gerrit.wikimedia.org/r/#/c/321230/2). That means it shouldn't require much actual work beyond keeping an eye on it and kicking it if the things Ori has been putting in place don't work.
Given that we're not the only ones using statsv and considering its function, analytics seems like the home it should get. What do you think?
Would it not make more sense for the team or teams that maintain statsd and Graphite to maintain statsv?
As far as I can see, statsv is part of that ecosystem, and as far as I know the Analytics team doesn't really have much to do with it.
On Mon, 14 Nov 2016 at 10:40 Gilles Dubuc <gilles@wikimedia.org> wrote:
> Hi,
> With Ori not responsible for statsv maintenance in an official capacity, should the Analytics team handle statsv maintenance going forward?
> Ori has tried to leave it in a state that doesn't need much maintenance (read: restarts in case of issues) and is still trying to make it do so https://gerrit.wikimedia.org/r/#/c/321230/2. Which means that it shouldn't require that much actual work other than keeping an eye on it and kicking it if the things Ori has been trying to put in place don't work.
> Given that we're not the only ones using statsv and considering its function, analytics seems like the home it should get. What do you think?
> _______________________________________________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
+ops
Analytics (Otto & Luca) probably have the most experience with Python Kafka clients, and are also the most likely to cause statsv problems (due to analytics Kafka broker restarts, etc.). So it makes sense for us to be at least partially responsible.
On the other hand, statsv is for operational/performance metrics, so it’d be good if ops/performance folks had their eyes on it too. Can ops monitoring (Filippo?) and analytics co-own this service? I can do some work soon to update the Kafka client, but we should all know how it works and be ready and willing to modify or restart it.
I agree it makes sense to co-own statsv. I think it also needs a primary maintainer/point of contact in case issues beyond basic operational monitoring come up (e.g. general maintenance and updates). What do you think?
thanks, Filippo
On Mon, Nov 14, 2016 at 8:21 AM, Andrew Otto <otto@wikimedia.org> wrote:
> +ops
> Analytics (Otto & Luca) probably have the most experience with python kafka clients, and also are the most likely to cause statsv problems, (due to analytics kafka broker restarts, etc.). So it makes sense for us to be at least partially responsible.
> On the other hand, statsv is for operational/performance metrics, so it’d be good if ops/performance folks had their eyes on it too. Can ops monitoring (Filippo?) and analytics co-own this service? I can do some work soon to update the Kafka client, but we should all know how it works and be ready and willing to modify or restart it.
> Ops mailing list
> Ops@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/ops
> I agree it makes sense to co-own statsv. I think it also needs to have a
> primary maintainer/point of contact if issues beside basic operational monitoring come up (e.g. general
> maintenance and updates). What do you think?
Sounds fine. Also, the performance team probably doesn't mind contributing to its maintenance as long as they are not on the hook for ops.
I have created a page to document operational tasks, linked from https://wikitech.wikimedia.org/wiki/Graphite#statsv:
https://wikitech.wikimedia.org/wiki/Analytics/statsv
On Mon, Nov 14, 2016 at 11:57 AM, Filippo Giunchedi <fgiunchedi@wikimedia.org> wrote:
> I agree it makes sense to co-own statsv. I think it also needs to have a primary maintainer/point of contact if issues beside basic operational monitoring come up (e.g. general maintenance and updates). What do you think?
> thanks, Filippo