On Mon, Aug 9, 2010 at 11:17 PM, Tim Starling <tstarling(a)wikimedia.org> wrote:
> On 10/08/10 15:16, Rob Lanphier wrote:
>> We have a single collection point for all of our logging, which is
>> actually just a sampling of the overall traffic (designed to be
>> roughly one out of every 1000 hits). The process is described here:
>> http://wikitech.wikimedia.org/view/Squid_logging
>> My understanding is that this code is also involved somewhere:
>> http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/
>> ...but I'm a little unclear on what the relationship between that
>> code and the code in trunk/udplog is.
>
> Maybe you should find out who wrote the relevant code and set up the
> relevant infrastructure, and ask them directly. It's not difficult to
> find out who it was.
Well, yes, I was hoping you'd weigh in on this thread. At any rate,
there are a couple of problems with the way that it works:

>> 1. Once we saturate the NIC on the logging machine, the quality of
>> our sampling degrades pretty rapidly. We've generally had a problem
>> with that over the past few months.
>
> We haven't saturated any NICs.
Sorry, I assumed it was a NIC. There has been packet loss, from what
I understand. I'll leave it at that.
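For anyone following along, the 1-in-1000 sampling described above can be sketched roughly like this. This is just my illustration of the idea, not the actual udp2log code; I'm assuming a simple counter-based filter (every 1000th line) rather than random selection, since that's the cheapest way to do it over a high-volume stream:

```python
import sys

# Hypothetical sketch of 1-in-1000 log sampling over a stream of log
# lines, in the spirit of a sampled udp2log pipe. Counter-based: emit
# every `factor`-th line, drop the rest.
SAMPLE_FACTOR = 1000

def sample_lines(lines, factor=SAMPLE_FACTOR):
    """Yield every `factor`-th line from an iterable of log lines."""
    for i, line in enumerate(lines, start=1):
        if i % factor == 0:
            yield line

if __name__ == "__main__":
    # Read log lines on stdin, write the sampled subset to stdout.
    for line in sample_lines(sys.stdin):
        sys.stdout.write(line)
```

Note that a filter like this only sees whatever packets actually arrive, which is why upstream packet loss degrades the sample rather than just thinning it uniformly.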
Rob