On Fri, Jun 5, 2009 at 6:38 PM, Tim Starling <tstarling(a)wikimedia.org> wrote:
Peter Gervai wrote:
Is it possible to write code that
processes raw squid data?
Who do I have to bribe? :-/
Yes it's possible. You just need to write a script that accepts a log
stream on stdin and builds the aggregate data from it. If you want
access to IP addresses, it needs to run on our own servers with only
anonymised data being passed on to the public.
http://wikitech.wikimedia.org/view/Squid_logging
http://wikitech.wikimedia.org/view/Squid_log_format
How much of that is really considered private? IP addresses
obviously, anything else?
I'm wondering if a cheap and dirty solution (at least for the low
traffic wikis) might be to write a script that simply scrubs the
private information and makes the rest available for whatever
applications people might want.
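
A minimal sketch of what such a scrubbing filter could look like, in
Python. It assumes only that log lines are space-separated and that a
client address appears as a bare field of its own; it does not rely on
any particular field order from the Squid log format page. The names
scrub_line, is_ip and PLACEHOLDER are made up for illustration. It only
handles IP addresses, so anything else that turns out to be private
(addresses embedded in URLs or with a port attached, for example) would
need extra rules.

```python
import ipaddress
import sys

PLACEHOLDER = "-"  # hypothetical marker written in place of a removed address


def is_ip(token):
    """Return True if the token parses as a bare IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(token)
        return True
    except ValueError:
        return False


def scrub_line(line):
    """Replace every space-separated field that is a bare IP address."""
    fields = line.rstrip("\n").split(" ")
    return " ".join(PLACEHOLDER if is_ip(f) else f for f in fields)


def main(stream_in=sys.stdin, stream_out=sys.stdout):
    """Read log lines from stream_in and write scrubbed lines to stream_out."""
    for line in stream_in:
        stream_out.write(scrub_line(line) + "\n")
```

Calling main() at the end of the file would turn this into a filter
that reads the log stream on stdin and writes the scrubbed stream to
stdout.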
-Robert Rohde