The idea is to add extra metrics to the dataset in a way that doesn't break
what is there.
Right now all HTML requests are counted equally. It would be very useful to
have a more sanitized count that comes close to human page views, meaning
all bot requests are excluded (user agent contains
bot/spider/crawler/http). Also, 404s might be counted separately.
But these would be extra data lines in the file with different codes, just
as mobile metrics were added two years ago.
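A rough sketch of the filter described above, to make the proposal concrete.
The category labels, the function names, the case-insensitive matching, and
the choice to check for bots before checking for 404s are all assumptions
for illustration, not an agreed design:

```python
import re
from collections import Counter

# Substrings named in the proposal: bot/spider/crawler/http.
# Case-insensitive matching is an assumption.
BOT_PATTERN = re.compile(r"bot|spider|crawler|http", re.IGNORECASE)


def classify(user_agent, status):
    """Return a hypothetical category label for one HTML request."""
    # Bot check comes first, so a bot request that returns 404
    # is counted as "bot" (an assumption, not part of the proposal).
    if BOT_PATTERN.search(user_agent or ""):
        return "bot"
    if status == 404:
        return "not_found"
    # Everything else, including a missing user agent, is counted as
    # a likely human page view in this sketch.
    return "human"


def count_requests(requests):
    """Count (user_agent, status) pairs per category."""
    return Counter(classify(ua, status) for ua, status in requests)
```

Each category would then be written out as its own extra line with its own
code, so that the existing lines stay exactly as they are.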
To be vetted before implementation :-)
Erik
-----Original Message-----
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Federico Leva
(Nemo)
Sent: Wednesday, November 14, 2012 1:18 AM
To: A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics.
Cc: fr-tech(a)wikimedia.org
Subject: Re: [Analytics] Web Access Log Format Changes
Diederik van Liere, 14/11/2012 00:51:
Do you have any specific requests regarding dumps.wikimedia.org?
Only that Domas' pageview logs are kept functioning and without any format
change. That's by far the most used stats tool in Wikimedia land.
Of course more domains are also appreciated! But that's another story.
Nemo
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics