zcat /a/squid/archive/mobile/mobile-sampled-100.tsv.log-20130421.gz | awk '{ print $5 }' | sort | uniq -c | sort -nr | head
   7706 208.80.154.x
   7523 208.80.154.x
   7467 208.80.154.x
   7133 208.80.154.x

I'm running a job to learn more about the sessions with the most pageviews so hopefully the mystery will be solved soon, but afaik the isPageview filter excludes hits that match our CIDR ranges (and it has tests). I'll certainly double-check it, as it's used everywhere. (Also, this dataset comes from the mobile varnishes, not the squids, fwiw.)
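
For a quick cross-check outside the filter itself, here is a minimal sketch that drops hits from one internal block before counting. It assumes the client IP is field 5, as in the command above. The `drop_internal` name and the choice of 208.80.154.0/24 are illustrative assumptions taken from the addresses shown here; this is not the isPageview filter or its actual CIDR list.

```shell
# Sketch only: drop_internal is a hypothetical helper, not part of any
# existing tooling. It removes log lines whose 5th field (client IP)
# falls inside an ASSUMED internal /24, 208.80.154.0/24.
drop_internal() {
  # For a /24, "starts with 208.80.154." is equivalent to CIDR membership.
  # index() returns 1 only when the field begins with the prefix, so
  # every line where that is not the case is printed unchanged.
  awk -v prefix="208.80.154." 'index($5, prefix) != 1'
}
```

Placed in front of the existing pipeline, e.g. `zcat <logfile> | drop_internal | awk '{ print $5 }' | sort | uniq -c | sort -nr | head`, it should show whether the top talkers disappear once that block is excluded. A string-prefix match only works for ranges that end on an octet boundary; anything else (a /22, say) needs a real CIDR comparison.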

--
David Schoonover
dsc@wikimedia.org


On Tue, Apr 23, 2013 at 2:29 PM, Ori Livneh <ori@wikimedia.org> wrote:



On Tuesday, April 23, 2013 at 11:13 AM, Matthew Walker wrote:

> > Max Pageviews in one Session: 141,882
>

zcat /a/squid/archive/mobile/mobile-sampled-100.tsv.log-20130421.gz | awk '{ print $5 }' | sort | uniq -c | sort -nr | head
   7706 208.80.154.x
   7523 208.80.154.x
   7467 208.80.154.x
   7133 208.80.154.x

(I censored the last octet on the off-chance that it is sensitive.) These are internal IPs. If they haven't been filtered out, they're probably causing the huge page view count.


--
Ori Livneh



_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics