The idea is to add extra metrics to the dataset in a way that doesn't break what is there.
Right now all HTML requests are counted equally. It would be very useful to have a more sanitized count that comes close to human page views, meaning all bot requests are excluded (user agent contains bot/spider/crawler/http). Also, 404s might be counted separately. But this would be extra data lines in the file with different codes, just like the mobile metrics that were added two years ago. To be vetted before implementation :-)
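To make the proposal concrete, here is a rough Python sketch of the classification described above. It is only an illustration, not the actual log-processing code: the function names, the category labels, the case-insensitive substring match, and the choice to check for bots before checking for 404s are all assumptions made for the example.

```python
from collections import Counter

# Substrings named in the proposal; matching them case-insensitively
# is an assumption of this sketch.
BOT_MARKERS = ("bot", "spider", "crawler", "http")


def classify_request(user_agent, status):
    """Return a hypothetical category label for a single request."""
    ua = (user_agent or "").lower()
    # Assumption: a bot request stays in the bot bucket even if it hit a 404.
    if any(marker in ua for marker in BOT_MARKERS):
        return "bot"
    if status == 404:
        return "not_found"
    return "human"


def count_requests(requests):
    """Count requests per category from (user_agent, status) pairs."""
    return Counter(classify_request(ua, status) for ua, status in requests)


# Made-up sample requests, for illustration only.
sample = [
    ("Mozilla/5.0 (Windows NT 6.1) Firefox/16.0", 200),
    ("Googlebot/2.1 (+http://www.google.com/bot.html)", 200),
    ("Mozilla/5.0 (Windows NT 6.1) Firefox/16.0", 404),
]
counts = count_requests(sample)
```

Each category would then be written out as its own data line with its own code, next to the existing totals, so nothing that reads the current lines has to change.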
Erik
-----Original Message-----
From: analytics-bounces@lists.wikimedia.org [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Federico Leva (Nemo)
Sent: Wednesday, November 14, 2012 1:18 AM
To: A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics.
Cc: fr-tech@wikimedia.org
Subject: Re: [Analytics] Web Access Log Format Changes
Diederik van Liere, 14/11/2012 00:51:
Do you have any specific requests regarding dumps.wikimedia.org?
Only that Domas' pageview logs are kept functioning and without any format change. That's by far the most used stats tool in Wikimedia land. Of course more domains are also appreciated! But that's another story.
Nemo
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics