On 10/08/10 15:16, Rob Lanphier wrote:
We have a single collection point for all of our logging, which is
actually just a sampling of the overall traffic (designed to be
roughly one out of every 1000 hits). The process is described here:
http://wikitech.wikimedia.org/view/Squid_logging
My understanding is that this code is also involved somewhere:
http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/
...but I'm a little unclear on the relationship between that code
and the code in trunk/udplog.
Maybe you should find out who wrote the relevant code and set up the
relevant infrastructure, and ask them directly. It's not difficult to
find out who it was.
At any rate, there are a couple of problems with the way that it works:
1. Once we saturate the NIC on the logging machine, the quality of
our sampling degrades pretty rapidly. We've generally had a problem
with that over the past few months.
We haven't saturated any NICs.
-- Tim Starling