Just a quick note to clarify that this change only filters out bots whose requests carry a user agent string that identifies them as bots. Our work on identifying bots that are less evident is tracked in Phabricator task T138207.
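
To illustrate what "identifies them as such" means in practice, here is a minimal sketch of a user-agent check. The pattern and the function name is_self_identified_bot are assumptions made for illustration, not the rule set actually deployed; a bot that sends a browser-like user agent would pass through this check unnoticed.

```python
import re

# Hypothetical pattern: matches user agents that announce themselves
# with common bot-like tokens. The real filter may use a different list.
BOT_PATTERN = re.compile(r"bot|spider|crawler", re.IGNORECASE)


def is_self_identified_bot(user_agent):
    """Return True if the user agent string announces itself as a bot."""
    if not user_agent:
        # Missing or empty user agents carry no self-identification.
        return False
    return BOT_PATTERN.search(user_agent) is not None
```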

Thanks!

On Wed, May 24, 2017 at 6:35 PM, Jon Katz <jkatz@wikimedia.org> wrote:
Nice change!  Thanks.

On Wed, May 24, 2017 at 8:05 AM, Tilman Bayer <tbayer@wikimedia.org> wrote:
Thanks, Francisco!
To express it from the perspective of users of this data: the results of your EventLogging queries may change slightly, but for the better, because accuracy improves. (In the past, GoogleBot, for example, has shown up in schemas for mobile web and the Android Wikipedia app.)

On Wed, May 24, 2017 at 4:54 AM, Francisco Dans <fdans@wikimedia.org> wrote:
Hi all,

Today we'll be deploying a change that affects how events triggered by bots/spiders are stored. We have added a property called is_bot to the user agent map in the event capsule. We use it to keep bot-triggered events from being persisted in MySQL, so they are stored only in Hadoop.
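
As a rough sketch of the routing described above (not the actual EventLogging code), the decision could look like the following. The key name "userAgent", the sink labels, and the function storage_targets are assumptions for illustration; only the is_bot property comes from this change.

```python
def storage_targets(event):
    """Return the list of stores an event should be written to.

    Assumes the event capsule exposes the user agent map under the
    hypothetical key "userAgent", with a boolean "is_bot" property.
    """
    user_agent = event.get("userAgent") or {}
    if user_agent.get("is_bot", False):
        # Bot-triggered events are kept out of MySQL.
        return ["hadoop"]
    # All other events go to both stores.
    return ["hadoop", "mysql"]
```

For example, an event whose user agent map has is_bot set to True would be routed to Hadoop only, while an event without the flag would still reach both stores.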

For more information on this change, refer to Phabricator task T67508.

Thank you!

--
Francisco Dans
Software Engineer, Analytics Team
Wikimedia Foundation

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics




--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB

--
Francisco Dans
Software Engineer, Analytics Team
Wikimedia Foundation