>Seems like using an unknown useragent string might be a better proxy.
We already do bot filtering using user agent strings. Please see: 
https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/Webrequest.java#L56

On Wed, Oct 21, 2015 at 12:58 PM, Ryan Kaldari <rkaldari@wikimedia.org> wrote:
I was under the impression that most of the MediaWiki bot frameworks do accept cookies, but I imagine many of the home-made bots don't. Seems like using an unknown useragent string might be a better proxy.

On Wed, Oct 21, 2015 at 1:49 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
>What was the motivation for this change? Just looking for possible automata?
Right. The motivation was to see whether the absence of cookies works as a cheap proxy for identifying robots. It is a pretty easy change to make that might help us quite a bit; we will update the list when we have some data.
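
For anyone who wants to try this on a sample of requests, here is a rough sketch of how the tag could be read and counted. It assumes the X-Analytics value is a list of key=value pairs separated by semicolons (please check the X-Analytics page on wikitech for the actual format). The function names and the "https=1" key in the usage lines are made up for illustration; they are not taken from the refinery code.

```python
def parse_x_analytics(header):
    """Parse an X-Analytics value into a dict.

    Assumption: the value looks like 'key1=value1;key2=value2'.
    Pairs without an '=' are skipped.
    """
    fields = {}
    for pair in header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_nocookie(header):
    """True if the request was tagged with nocookie=1."""
    return parse_x_analytics(header).get("nocookie") == "1"


def nocookie_share(headers):
    """Fraction of requests tagged nocookie=1 (0.0 for an empty sample)."""
    headers = list(headers)
    if not headers:
        return 0.0
    tagged = sum(1 for h in headers if is_nocookie(h))
    return tagged / len(headers)


if __name__ == "__main__":
    # Hypothetical sample values, not real request data.
    sample = ["nocookie=1", "https=1", "", "nocookie=1;https=1"]
    print(nocookie_share(sample))
```

The share this gives is only an upper bound on bot traffic, since fresh browser sessions and users with cookies disabled are tagged as well.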

On Wed, Oct 21, 2015 at 12:47 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
What was the motivation for this change? Just looking for possible automata?

On 21 October 2015 at 15:38, Nuria Ruiz <nuria@wikimedia.org> wrote:
> Team:
>
> As of today, incoming request data includes an extra bit of information in
> the X-Analytics header.
>
> If an incoming request to any Wikipedia project has no cookies whatsoever, it
> will be tagged with nocookie=1. A request without any cookies could
> correspond to a fresh browser session, a user browsing with cookies disabled
> or, most likely, a bot request, as most bots will not accept cookies. We
> *might* be able to use this setting as a cheap proxy to quantify bot
> traffic.
>
>
> Documentation about this change can be found here:
> https://wikitech.wikimedia.org/wiki/X-Analytics
>
>
> Thanks,
>
> Nuria
>
>
>
> _______________________________________________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>



--
Oliver Keyes
Count Logula
Wikimedia Foundation
