On Thu, Jan 28, 2016 at 11:15 AM, Marcel Ruiz Forns
<mforns@wikimedia.org> wrote:
> Hi analytics list,
>
> In the past months the WikimediaBot convention has been mentioned in a
> couple threads, but we (Analytics team) never finished establishing and
> advertising it. In this email we explain what the convention is today and
> what purpose it serves. And also ask for feedback to be sure we can continue
> with the next steps.
>
> What is the WikimediaBot convention?
> It is a way of better identifying Wikimedia traffic originated by bots.
> Today we know that a significant share of Wikimedia traffic comes from bots.
> We can recognize a part of that traffic with regular expressions[1], but we
> can not recognize all of it, because some bots do not identify themselves as
> such. If we could identify a greater part of the bot traffic, we could also
> better isolate the human traffic and permit more accurate analyses.
>
> Who should follow the convention?
> Computer programs that access Wikimedia sites or the Wikimedia API for
> reading purposes* in a periodic, scheduled or automatically triggered way.
>
> Who should NOT follow the convention?
> Computer programs that follow the on-site ad-hoc commands of a human, like
> browsers. And well known spiders that are otherwise recognizable by their
> well known user-agent strings.
>
> How to follow the convention?
> The client's user-agent string should contain the word "WikimediaBot". The
> word can be anywhere within the user-agent string and is case-sensitive.

This is useless unless someone is going to start blocking bots that
don't follow it.
There is an existing policy, which is not being followed or enforced:
https://meta.wikimedia.org/wiki/User-Agent_policy
It is also extremely annoying that clients (e.g. Pywikibot) now need
to add a Wikimedia-specific tag to their user-agent. A user-agent
should be client-specific, not server-specific. Why not just "Bot",
or "MediaWikiBot", which at least encompasses all sites that the client
can communicate with?
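
For concreteness, the check described in the quoted convention (the word
may appear anywhere in the user-agent and is case-sensitive) amounts to a
plain substring test. The sketch below is only an illustration: the
function name and the sample user-agent strings are made up, and the
`token` parameter is there just to show that a more generic word such as
"MediaWikiBot" would be checked in exactly the same way.

```python
def follows_convention(user_agent: str, token: str = "WikimediaBot") -> bool:
    """Return True if `token` occurs anywhere in the user-agent string.

    The match is a case-sensitive substring test, as the quoted
    convention describes. The function name is hypothetical.
    """
    return token in user_agent


# Hypothetical user-agent strings, for illustration only.
samples = [
    "ExampleClient/1.0 (WikimediaBot; ops@example.org)",  # follows it
    "ExampleClient/1.0 (wikimediabot)",                   # wrong case
    "ExampleClient/1.0 (MediaWikiBot)",                   # generic token
]

for ua in samples:
    print(ua, follows_convention(ua), follows_convention(ua, "MediaWikiBot"))
```

Note that a substring test cannot tell a missing tag from a misspelled or
differently cased one, so a client that gets the case wrong is simply
counted as not following the convention.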
--
John Vandenberg