These ideas sound creative :) Also, if the problem is *storage* of
logs, rather than the server hit of *creating* the logs, then would it
not be possible to write some log analysis routines that massively
summarise the logs, then delete them? Such a thing could be run once a
day.
It would only need to store the number of hits per day to each page,
which, even with 1 million articles, would only be about 4 megabytes
per day (assuming one 4-byte counter per page), right? :)
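A rough sketch of what such a daily summary routine might look like. It assumes a whitespace-separated log line with the requested URL in the seventh field; the field position, the sample lines, and the names `summarise` and `counter_storage_bytes` are illustrative assumptions, not anything taken from the real server setup.

```python
from collections import Counter

# Assumption: the requested URL is the 7th whitespace-separated field.
URL_FIELD = 6


def summarise(lines, url_field=URL_FIELD):
    """Count hits per URL, skipping lines too short to hold the URL field."""
    hits = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > url_field:
            hits[fields[url_field]] += 1
    return hits


def counter_storage_bytes(n_pages, bytes_per_counter=4):
    """Size of one day's summary with one fixed-width counter per page."""
    return n_pages * bytes_per_counter


# Made-up sample lines, for illustration only.
sample = [
    "1143800000.123 12 10.0.0.1 TCP_HIT/200 5120 GET http://en.wikipedia.org/wiki/Squid - NONE/- text/html",
    "1143800001.456 30 10.0.0.2 TCP_MISS/200 8200 GET http://en.wikipedia.org/wiki/Cron - NONE/- text/html",
    "1143800002.789 9 10.0.0.3 TCP_HIT/200 5120 GET http://en.wikipedia.org/wiki/Squid - NONE/- text/html",
    "truncated line",
]

daily_hits = summarise(sample)
```

Once the counts are written out, the raw log for that day could be deleted. With these assumptions, `counter_storage_bytes(1_000_000)` gives 4,000,000 bytes per day, which matches the 4 megabyte estimate.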
Steve
On 3/31/06, Neil Harris <usenet(a)tonal.clara.co.uk> wrote:
Brion Vibber wrote:
Andrew Gray wrote:
Is that "keep recording but ignore them", or disable in the sense of
turn off logging totally? Just curious...
After a few months of having logs that you're not reading fill up the servers'
hard disks every few days, you turn them off. :)
-- brion vibber (brion @ pobox.com)
How about a cron job that turns logging on, then off, intermittently?
For example, on each server, have a cron job that does this:
Every 5 mins:
    Is logging on?
        Then: turn it off
        Else: generate a random number
            If it's == 0 mod 1000:
                Then: turn logging on
                Else: do nothing
This way, you get representative short blocks of 5 minutes of traffic,
kicking in once every three days or so on each of the 100 or so servers
at random times of the day or night. This would also suffice for gross
statistical analysis, and wouldn't require any modification of the squid
code, just a short external shell script.
Log-rotation should handle the rest and prevent the disks filling up,
since the average sampling rate would then be low enough to cope with.
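The decision logic above could be sketched roughly as follows. This only models the on/off choice made at each cron tick; how logging is actually switched on or off on a squid server is left out, and the names `next_state` and `mean_gap_days` are hypothetical.

```python
import random

MODULUS = 1000       # from the proposal: switch on when the draw is 0 mod 1000
PERIOD_MINUTES = 5   # from the proposal: the cron job runs every 5 minutes


def next_state(logging_on, draw, modulus=MODULUS):
    """Return whether logging should be on after this cron tick."""
    if logging_on:
        # Logging was on for the last period, so turn it off.
        return False
    # Logging is off: turn it on only if the random draw is 0 mod modulus.
    return draw % modulus == 0


def mean_gap_days(period_minutes=PERIOD_MINUTES, modulus=MODULUS):
    """Average wait, in days, before logging switches on again on one server."""
    return period_minutes * modulus / (24 * 60)


# One tick, starting with logging off and a fresh random draw.
state = next_state(False, random.randrange(1_000_000))
```

With a 1 in 1000 chance at each 5 minute tick, `mean_gap_days()` comes out at about 3.5 days, in line with the "once every three days or so" figure.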
-- Neil
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l