Clearly Wikipedia et al. use "bot" to refer to automated software that edits the site, but it seems like you are using the term to refer to all automated software, so it might be good to clarify.
OK, we can make that clear in the documentation. Looking into that, I've seen that some bots can also generate pageviews in the process of doing their "editing" work, so we should also include them as a potential source of pageview traffic. Maybe we can reuse the existing User-Agent policy.
This makes a lot of sense. If I build a bot that crawls Wikipedia and
Facebook public pages, it really doesn't make sense for my bot to have a "wikimediaBot" user agent; just the word "Bot" should probably be enough.
Totally agree.
I guess a bigger question is why try to differentiate between "spiders" and "bots"
at all?
I don't think we need to differentiate between "spiders" and "bots". The most important question we want to answer is: how much of the traffic we consider "human" today is actually "bot"? So, +1 to "bot" (case-insensitive).
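To make the "case-insensitive" part concrete, here is a minimal sketch of what such a check could look like. It is only an illustration: the function name `is_self_identified_bot` and the sample User-Agent strings are made up for this example and are not taken from any existing Wikimedia code or policy.

```python
import re

# Assumption for illustration: a client counts as a self-identified bot
# if its User-Agent contains the substring "bot" in any letter case.
BOT_PATTERN = re.compile(r"bot", re.IGNORECASE)


def is_self_identified_bot(user_agent):
    """Return True if the User-Agent contains "bot", ignoring case."""
    # Treat a missing User-Agent (None) like an empty string.
    return bool(BOT_PATTERN.search(user_agent or ""))


if __name__ == "__main__":
    # Hypothetical User-Agent strings, not real traffic.
    samples = [
        "MyCrawler-Bot/1.0 (me@example.org)",
        "MediaWikiBot/2.0",
        "examplebot/0.1",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/44.0",
    ]
    for ua in samples:
        print(ua, "->", is_self_identified_bot(ua))
```

A plain substring match like this is deliberately loose, so it would also flag any User-Agent that happens to contain "bot" inside another word. Whether that is acceptable depends on how the counts are going to be used.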
On Fri, Jan 29, 2016 at 9:16 PM, John Mark Vandenberg jayvdb@gmail.com wrote:
On 28 Jan 2016 11:28 pm, "Marcel Ruiz Forns" mforns@wikimedia.org wrote:
Why not just "Bot", or "MediaWikiBot", which at least encompasses all
sites that the client can communicate with?
I personally agree with you, "MediaWikiBot" seems to have better
semantics.
For clients accessing the MediaWiki API, it is redundant. All it does is identify bots that comply with this edict from analytics.
-- John Vandenberg
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics