Hi Mark,
Thanks for the helpful reply. Comments inline:
On Tue, Aug 10, 2010 at 2:54 AM, Mark Bergsma mark@wikimedia.org wrote:
> As already stated elsewhere, we didn't really saturate any NICs, just some socket buffers. Because of the large number of configured log pipes, the software (udp2log) could not empty the socket buffers fast enough.
Based on this and IRC conversations with Tim and Domas, here's my understanding of things now (restating to make sure that I understand):
The current system is a single-threaded application that takes packets in synchronously, and spits them out to several places based on the configuration file described here: http://wikitech.wikimedia.org/view/Squid_logging
One problem that we're hitting with this daemon^H^H^H^H^H^Hlistener is that when it gets too bogged down with a complex configuration, it doesn't get around to emptying the socket buffer. Since it's single-threaded, it handles each of the configured logging destinations before reading the next packet. We're not CPU-bound at this point: the existing solution seems to start flaking out at 40% CPU with a complicated configuration, and is humming along at 20% with the current simplified config. The problem is that we block while we fire up awk or whatever on the logging side, and the socket buffer overflows in the meantime.
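To make that failure mode concrete, here's a toy simulation in Python. It is not udp2log's actual code, and every name and number in it is made up for illustration. It models a listener that has to finish all configured destinations for one packet before it reads the next, with a fixed-size socket buffer in front of it:

```python
from collections import deque


def simulate(num_packets, arrival_interval, time_per_dest, num_dests, buffer_slots):
    """Count packets dropped by a single-threaded listener (toy model).

    Packets arrive every `arrival_interval` time units. Handling one packet
    takes `time_per_dest * num_dests`, because every destination is served
    before the next read. Arrivals that find the buffer full are dropped.
    """
    buffer = deque()   # arrival times of packets waiting in the socket buffer
    dropped = 0
    busy_until = 0     # time at which the listener is free to read again
    service_time = time_per_dest * num_dests

    for i in range(num_packets):
        now = i * arrival_interval
        # Let the listener work through the backlog up to the current time.
        while buffer and busy_until <= now:
            arrived = buffer.popleft()
            busy_until = max(busy_until, arrived) + service_time
        # The new packet either fits in the buffer or is lost.
        if len(buffer) >= buffer_slots:
            dropped += 1
        else:
            buffer.append(now)
    return dropped
```

With the made-up numbers `simulate(1000, 10, 2, 3, 8)`, handling a packet takes 6 time units against an arrival interval of 10, and nothing is dropped. Bump the configuration to 10 destinations and handling takes 20 units per packet, so the buffer fills and packets start getting dropped, even though the listener spends its time waiting rather than computing.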
A solution that Tim and others are kicking around is reworking the listener in one or more of the following ways:
1. Move to some non-blocking networking library (e.g. Boost asio, libevent)
2. Go multi-threaded
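As a rough sketch of the multi-threaded option (in Python for brevity; this is my own illustration and not a proposed patch to udp2log), one thread could do nothing but empty the socket while one worker per destination does the slow part. The function name, queue size, and timeouts are all assumptions:

```python
import queue
import socket
import threading


def start_listener(sock, handlers, stop, queue_size=10000):
    """Start one reader thread plus one worker thread per handler.

    `handlers` are callables taking one packet (bytes); `stop` is a
    threading.Event that shuts everything down once it is set.
    """
    sock.settimeout(0.1)  # lets the reader notice `stop` periodically
    queues = [queue.Queue(maxsize=queue_size) for _ in handlers]

    def reader():
        # Only job: get packets out of the socket buffer as fast as possible.
        while not stop.is_set():
            try:
                packet = sock.recv(65535)
            except socket.timeout:
                continue
            for q in queues:
                try:
                    q.put_nowait(packet)
                except queue.Full:
                    pass  # a slow destination loses data; the others do not

    def worker(q, handler):
        # Keep draining after `stop` is set until the queue is empty.
        while not (stop.is_set() and q.empty()):
            try:
                packet = q.get(timeout=0.1)
            except queue.Empty:
                continue
            handler(packet)  # the potentially slow part (pipe, file, ...)

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=worker, args=(q, h))
                for q, h in zip(queues, handlers)]
    for t in threads:
        t.start()
    return threads
```

The point of the design is that a destination that falls behind only fills its own queue, so drops are confined to that destination instead of showing up as socket buffer overflows that affect everyone.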
Mark, as you point out, we could go with some multicast solution if we need to split it up among boxes. As Domas points out, we could even go multi-process on the same box without really maxing it out.
The solutions we're talking about seem to solve the socket buffer problem, but it sounds like we may also need to get some clearer requirements on any new functionality that's needed. It sounds like we'll be able to get some more mileage out of the existing solution with some of the reworking described above. It's not entirely clear yet if this buys us enough capacity+capability for the increased requirements. I'll check in with Tomasz and others working on fundraiser stuff to find out more.
Rob