On 29 November 2010 10:11, Domas Mituzas <midom.lists(a)gmail.com> wrote:
>> The sampled 1/1000 squid logs can be used for statistical purposes, such as
>> page view stats. Someone more techy can answer that better than I can, if
>> the samples include IP addresses that could be used w/ geoip for geographic
>> analysis. (I think perhaps not)
>
> we do aggregations on full sample, not 1/1000
> 1/1000 gets saved to a file for post-mortems and "wtf is going on" type of
> analysis.

Ah, that explains it - I was wondering how we could get something as
precise as "three views one day, five the next" out of a 1/1000
sample! So am I right in assuming that what happens is:
1) page request comes in and is served
2) every thousandth request is sent to a separate file and logged
3) the rest are stripped of all data bar "X page requested"
4) this is kept for the pageview statistics, which are very fine-grained
The end result: one file with 0.1% of requests logged in detail and
another file with "hit counts" and no more.
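
To make the question concrete, here is a rough Python sketch of the flow as I
understand it. It is only an illustration of my assumption, not the actual
squid or aggregation code: the function name, the record shape (a dict with a
"page" key) and the in-memory list and counter standing in for the two files
are all made up. Following Domas's note that aggregation runs on the full
sample, every request is counted, and each thousandth one is additionally kept
in full.

```python
from collections import Counter

SAMPLE_EVERY = 1000  # assumed rate: keep 1 request in 1000 in full detail


def process_requests(requests, sample_every=SAMPLE_EVERY):
    """Split a request stream into a detailed sample and per-page hit counts.

    `requests` is an iterable of dicts with at least a "page" key. This is a
    hypothetical record shape, not the real squid log format.
    """
    detailed_sample = []    # stands in for the 1/1000 log file
    hit_counts = Counter()  # stands in for the page view statistics

    for n, request in enumerate(requests, start=1):
        # Every request adds to the hit count; only the page name is kept.
        hit_counts[request["page"]] += 1
        # Every thousandth request is also stored with all of its fields.
        if n % sample_every == 0:
            detailed_sample.append(request)

    return detailed_sample, hit_counts
```

With 3000 requests this would leave three detailed records in the first
structure and exact per-page totals in the second, which would account for
figures as fine-grained as "three views one day, five the next".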
--
- Andrew Gray
andrew.gray(a)dunelm.org.uk