Ashar Voultoiz wrote:
Mark Bergsma wrote:
<snip>
Can we start installing snmpd on all servers to at least get some basic
data? :o)
That's exactly the same data ganglia is currently monitoring, so I don't
really see the point...
So let's write ganglia scripts :o)
If we want to monitor 15 services every minute, we will have to telnet
the gmetad every 2 seconds. We could build a caching system though:
check gmetad, cache the result for one minute, then have the nagios
plugins grep the cache instead of telnetting gmetad.
I think I have an idea about how to handle that.
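
For what it's worth, the caching idea could look roughly like the sketch
below. This is only an illustration in Python, not anything that exists on
the servers: read_cached, the cache path and the fetch callable are all
made-up names, and fetch stands in for whatever actually pulls the XML from
gmetad. Each nagios plugin would call read_cached and only the first call in
any given minute would hit gmetad.

```python
import os
import tempfile
import time


def read_cached(cache_path, fetch, max_age=60.0):
    """Return the cached text if it is younger than max_age seconds.

    Otherwise call fetch() (hypothetical: whatever retrieves the gmetad XML),
    store the result in cache_path and return it.
    """
    try:
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age:
            with open(cache_path, encoding="utf-8") as fh:
                return fh.read()
    except OSError:
        pass  # no cache yet, or it is unreadable: fall through and refetch

    text = fetch()
    # Write to a temp file and rename it into place, so that a plugin
    # running at the same moment never reads a half-written cache.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    os.replace(tmp_path, cache_path)
    return text
```

A plugin would then search the returned text for its metric instead of
opening its own connection.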
I wrote a Perl script a while back to poll the gmond XML output from one
machine and stop or start a process on another machine based on the value of
a metric retrieved. I didn't use telnet (ick); I read from a socket and then
used an XPath module to find the metric in the XML. It's probably lying
around in my home directory somewhere if you want to look at it.
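
This is not that script, just a minimal sketch of the same approach in
Python: connect to gmond, read the XML it dumps until it closes the socket,
then pick out one metric with an XPath-style lookup. It assumes gmond is
listening on its default TCP port 8649 and that the dump uses the usual
HOST and METRIC elements with NAME and VAL attributes. The function names
are mine.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host="localhost", port=8649, timeout=10.0):
    """Read the XML dump sent on connect, until the peer closes the socket.

    Port 8649 is assumed to be the gmond default; adjust for your setup.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    # Return bytes so the parser can honour the encoding the XML declares.
    return b"".join(chunks)


def find_metric(xml_data, host_name, metric_name):
    """Return the VAL attribute of one metric on one host, or None.

    Uses ElementTree's limited XPath support; host and metric names
    containing a single quote are not handled by this sketch.
    """
    root = ET.fromstring(xml_data)
    path = f".//HOST[@NAME='{host_name}']/METRIC[@NAME='{metric_name}']"
    metric = root.find(path)
    return None if metric is None else metric.get("VAL")
```

Something like find_metric(fetch_xml("somehost"), "web1", "load_one") would
then give the current value as a string, with "somehost" and "web1" being
placeholders.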
If caching is required, then adding metrics to nagios is obviously not the
same as adding metrics to ganglia. For ganglia, you run gmetric whenever a
metric changes, so you can have a loop that sets 30 metrics in each pass if
you like. You don't give it a plugin for it to invoke at its leisure, you
make your own daemon.
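
To make that concrete, here is a rough sketch of such a daemon loop in
Python. It assumes the gmetric binary is on the PATH and takes its usual
--name, --value, --type and --units options; collect is a hypothetical
function you would write yourself that returns a dict of metric names to
current values, and uint32 as the default type is just a guess at what
most counters need.

```python
import subprocess
import time


def gmetric_command(name, value, metric_type="uint32", units=""):
    """Build the gmetric command line for one metric (nothing is run here)."""
    cmd = ["gmetric", "--name", name, "--value", str(value),
           "--type", metric_type]
    if units:
        cmd += ["--units", units]
    return cmd


def publish(collect, interval=60.0, passes=None, run=subprocess.run):
    """Call collect() and push every metric it returns, once per pass.

    collect is hypothetical: any callable returning {name: value}.
    passes=None loops forever, which is the daemon case; a number makes
    the loop stop after that many passes.
    """
    done = 0
    while passes is None or done < passes:
        for name, value in collect().items():
            run(gmetric_command(name, value), check=False)
        done += 1
        if passes is None or done < passes:
            time.sleep(interval)
```

Setting 30 metrics in each pass is then just a matter of collect returning
30 entries.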
-- Tim Starling