[Foundation-l] Google Analytics test [offtopic]

Gregory Maxwell gmaxwell at wikimedia.org
Tue Oct 2 22:54:34 UTC 2007


On 10/2/07, Andrew Whitworth <wknight8111 at hotmail.com> wrote:
> The issue of enabling page counters has been brought up before on
> bugzilla. See http://bugzilla.wikimedia.org/show_bug.cgi?id=5667 for one
> example. It's not that we aren't looking for an in-house solution, but it
> seems like most options have been exhausted. I would be thrilled to learn
> that there was an in-house logging mechanism available to us to use. We
> know that one doesn't currently exist in a usable form, and that the techs
> have more important things to do than throw one together for us.

What do you want, exactly?  I know google analytics offers a lot but
I'm guessing that your primary interests can be satisfied with far
less.

We already have some basic tools that count top viewed pages. There
are active improvements happening for that kind of reporting: I
checked some more pagecounter aggregation code into SVN an hour or
so ago.

I'm assuming you want more than just page request counting?

The log format is documented:

https://wikitech.leuksman.com/view/Squid_log_format

Write some good aggregation/reporting scripts that are either (1) very
very fast and memory efficient or (2) able to work well on sampled
data (say 1:1000, but it should be rate agile)... or ideally both
(keep in mind that our minimum traffic level is now up to ~16k req/s).
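
To make option (2) concrete, here is a minimal sketch of a streaming
counter that scales sampled counts back up by the sampling rate. It is
only an illustration: the names estimate_counts, sample_rate and
url_field are hypothetical, and it assumes whitespace-separated log
lines with the URL at a position you pass in, so check the documented
log format before relying on any field index.

```python
from collections import Counter
from typing import Iterable


def estimate_counts(lines: Iterable[str],
                    sample_rate: int = 1000,
                    url_field: int = 0) -> Counter:
    """Estimate per-URL request totals from a 1:sample_rate log sample.

    Assumptions (not taken from the real log format): fields are
    separated by whitespace and the URL sits at index url_field.
    """
    sampled = Counter()
    for line in lines:  # one pass, one line in memory at a time
        fields = line.split()
        if len(fields) <= url_field:
            continue  # skip blank or truncated lines
        sampled[fields[url_field]] += 1
    # Each sampled request stands in for sample_rate real requests.
    return Counter({url: n * sample_rate for url, n in sampled.items()})
```

Because the rate is a parameter, the same script works if the sampling
changes. Memory still grows with the number of distinct URLs seen, so a
real version would probably need to cap or prune the table.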

Obviously the reporting scripts must not disclose private data.
This can be tricky. Use your best judgement and be prepared to be told
you've gotten that wrong and need to scrub the output further.
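
One possible scrubbing step, purely as an assumption about what might
be acceptable, is to drop any page whose count falls below a threshold
so that rarely requested pages never show up in a report. The name
scrub_report and the default min_count are hypothetical, and this alone
may well not be enough.

```python
from typing import Dict, List, Tuple


def scrub_report(counts: Dict[str, int],
                 min_count: int = 100) -> List[Tuple[str, int]]:
    """Return (page, count) rows, busiest first, omitting rare pages.

    The threshold is an illustrative assumption, not an agreed policy.
    """
    kept = [(page, n) for page, n in counts.items() if n >= min_count]
    # Sort by count descending, then by page name for stable output.
    kept.sort(key=lambda row: (-row[1], row[0]))
    return kept
```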

If you do this I'll work with you to make sure that the reports you
are looking for are generated if at all reasonably possible.


