On 10-08-10 07:16, Rob Lanphier wrote:
At any rate, there are a couple of problems with the way that it works:
- Once we saturate the NIC on the logging machine, the quality of
our sampling degrades pretty rapidly. We've generally had a problem with that over the past few months.
As already stated elsewhere, we didn't really saturate any NICs, just some socket buffers. Because of the large number of configured log pipes, the software (udp2log) could not empty the socket buffers fast enough.
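To illustrate the failure mode (this is not udp2log's code, just a minimal sketch): when a reader falls behind, the kernel's per-socket receive buffer fills up and further datagrams are silently dropped, however much headroom the NIC has. The function name, buffer size, and datagram counts below are made up for the demonstration, and the number of datagrams that survive depends on the operating system.

```python
import socket

def count_delivered(n_datagrams=2000, payload_size=1000, rcvbuf_bytes=4096):
    """Send a burst to a receiver that is not reading, then count survivors."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask for a deliberately small receive buffer; the kernel may round it up.
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"x" * payload_size
    try:
        # The "slow reader": nothing calls recv() while the burst arrives.
        for _ in range(n_datagrams):
            tx.sendto(payload, rx.getsockname())

        # Now drain whatever the kernel managed to queue.
        rx.setblocking(False)
        delivered = 0
        try:
            while True:
                rx.recv(65535)
                delivered += 1
        except BlockingIOError:
            pass  # buffer is empty
        return delivered
    finally:
        rx.close()
        tx.close()
```

Calling `count_delivered()` on a typical Linux box should return far fewer than the 2000 datagrams sent, which is the same effect we saw with too many log pipes slowing down the reader.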
If this were your typical commercial operation, the answer would be "why aren't you just logging into Streambase?" (or some other data warehousing storage solution). I'm not suggesting that we do that (or even look at any of the solutions that bill themselves as open source alternatives), since, while our needs are increasing, we still aren't planning to be anywhere near as sophisticated as a lot of data tracking orgs. Still, it's worth asking questions about our existing setup. Should we be looking to optimize our existing single-box setup, extending our software to have multi-node collection, or looking at a whole new collection strategy?
Besides the ideas currently being kicked around for improving or rewriting the UDP log collection software, there's also always the short-term, easy option of sending a multicast UDP stream, and having multiple collectors set up with distinct log pipes. E.g. one machine for the sampled logging, and another, independent machine to do all the special purpose log streams. I do like more efficient software solutions rather than throwing more iron at the problem, though. :)
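For the multicast option, a rough sketch of what each collector would do, assuming a made-up group address and port (`GROUP` and `PORT` are hypothetical, as are all the function names): every machine joins the same group, so each one receives the full stream and applies only its own pipes. The 1-in-N filter stands in for the sampled logging and is not taken from udp2log.

```python
import socket
import struct

GROUP = "239.0.0.1"  # hypothetical multicast group address
PORT = 8420          # hypothetical port

def membership_request(group, iface="0.0.0.0"):
    """Pack the ip_mreq structure: group address plus local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))

def open_collector(group=GROUP, port=PORT):
    """Open a UDP socket that receives everything sent to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock

def sampled(lines, every=1000):
    """Yield one line out of every `every` lines (1-in-N sampling)."""
    for i, line in enumerate(lines):
        if i % every == 0:
            yield line
```

The sampling machine would feed the datagrams it reads from `open_collector()` through `sampled(...)`, while the other machine would open the same group and run only the special purpose filters. Neither collector needs to know about the other.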