> bots and spiders. We know that the wikimedia rest api provides this
> information going back to July 1 2015.
Please keep in mind that these are only self-identified bots. Probably
about 1-5% of bot pageview traffic gets wrongly labeled as "user"; a
project is under way to better label this traffic as coming from bots.
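For reference, here is a rough Python sketch of how the per-agent numbers could be pulled from the REST API. It assumes the aggregate pageviews endpoint under https://wikimedia.org/api/rest_v1/ and the agent values "user" and "spider"; the function names, the User-Agent string and the July 1 2015 cutoff check (taken from the date mentioned in this thread) are illustrative only, so please check them against the current API documentation before relying on them.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base path of the aggregate pageviews endpoint.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate"
# Earliest timestamp (YYYYMMDDHH) per this thread: July 1 2015.
EARLIEST = "2015070100"


def aggregate_url(project, agent, start, end,
                  access="all-access", granularity="daily"):
    """Build the request URL for one project and one agent type."""
    # Fixed-width digit strings compare in chronological order.
    if start < EARLIEST:
        raise ValueError("agent-type data is not expected before " + EARLIEST)
    parts = [project, access, agent, granularity, start, end]
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_views(url, user_agent="pageview-example/0.1 (you@example.org)"):
    """Fetch the URL and return (timestamp, views) pairs.

    Needs network access; the User-Agent value is a placeholder.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [(item["timestamp"], item["views"]) for item in data["items"]]


if __name__ == "__main__":
    # Compare self-identified spider traffic with "user" traffic.
    for agent in ("user", "spider"):
        print(aggregate_url("zh.wikipedia.org", agent,
                            "2015070100", "2015070700"))
```

Passing a start date in May 2015 raises ValueError in this sketch, which mirrors the limitation discussed here: the agent split is only available from July 1 2015 onward.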
On Tue, Nov 13, 2018 at 6:41 AM Jennifer Pan <jp1(a)stanford.edu> wrote:
Hi there,
I'm an assistant professor in the Department of Communication at Stanford.
My co-author, Molly Roberts (Political Science, UCSD), and I are working on
a paper examining the effect of China's 2015 block of Chinese language
wikipedia on pageviews, which builds on our previous work on censorship in
China.
We are using the block to conduct an interrupted time series design to
measure the effect of censorship on Chinese users. Our main finding is that
Chinese users were using Wikipedia to browse (starting at the home page),
and the block influenced users' ability to explore and encounter unexpected
information. One question we have is whether the pageviews we observe are
driven by bots and spiders. We know that the wikimedia rest api provides
this information going back to July 1 2015. Since the China block of
Wikipedia was on May 19, 2015, we are wondering if there is pageview data
by agent type for zh.wikipedia.org pages (all or some subset like most
popular) going back to May 2015 (specifically May 18-21, 2015)? From
https://meta.wikimedia.org/wiki/Research:Timeline_of_Wikimedia_analytics,
it says that pageview data is available in bulk starting on May 1, 2015,
so we thought maybe there was some chance this data exists.
Any suggestions would be greatly appreciated, and if this is not possible,
please let us know.
Thank you!
Jennifer Pan
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics