Hi there,
Although this doesn't answer your specific question, I thought I'd share
that my observations from watching traffic patterns on some Wikimedia
pages suggest that the classification of readers into bot, spider, or
human has some margin of error, though I don't know how large it is.
That error might be worth considering as you analyze the traffic that
interests you, especially if you have reason to believe it is large
enough to affect your results.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
On Tue, Nov 13, 2018 at 2:41 PM Jennifer Pan <jp1(a)stanford.edu> wrote:
Hi there,
I'm an assistant professor in the Department of Communication at Stanford.
My co-author, Molly Roberts (Political Science, UCSD), and I are working on
a paper examining the effect of China's 2015 block of Chinese-language
Wikipedia on pageviews, which builds on our previous work on censorship in
China.
We are using the block in an interrupted time series design to
measure the effect of censorship on Chinese users. Our main finding is that
Chinese users were using Wikipedia to browse (starting at the home page),
and the block influenced users' ability to explore and encounter unexpected
information. One question we have is whether the pageviews we observe are
driven by bots and spiders. We know that the Wikimedia REST API provides
this information going back to July 1, 2015. Since the China block of
Wikipedia was on May 19, 2015, we are wondering if there is pageview data
by agent type for zh.wikipedia.org pages (all, or some subset such as the
most popular) going back to May 2015 (specifically May 18-21, 2015).
https://meta.wikimedia.org/wiki/Research:Timeline_of_Wikimedia_analytics
says that pageview data is available in bulk starting on May 1, 2015,
so we thought there might be some chance this data exists.
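For reference, here is a minimal sketch of how per-agent daily totals
might be requested for the period the REST API does cover. It only
builds the request URLs and does not fetch anything. The endpoint layout
and the agent names "user" and "spider" reflect our understanding of the
public Pageviews API and should be checked against its current
documentation; the helper name aggregate_url and the July 1-4 date range
are purely illustrative.

```python
from datetime import date

# Assumed base path of the Wikimedia Pageviews "aggregate" endpoint.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate"


def aggregate_url(project, agent, start, end,
                  access="all-access", granularity="daily"):
    """Build a per-project aggregate pageviews URL (hypothetical helper)."""
    # The API expects timestamps as YYYYMMDDHH; hour 00 is used here.
    fmt = "%Y%m%d00"
    parts = [BASE, project, access, agent, granularity,
             start.strftime(fmt), end.strftime(fmt)]
    return "/".join(parts)


# One URL per agent type, so human and spider traffic can be compared.
urls = {
    agent: aggregate_url("zh.wikipedia.org", agent,
                         date(2015, 7, 1), date(2015, 7, 4))
    for agent in ("user", "spider")
}

for agent, url in urls.items():
    print(agent, url)
```

Comparing the two series day by day would give a rough sense of how much
of the observed traffic is attributed to spiders, subject to whatever
classification error exists.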
Any suggestions would be greatly appreciated, and if this is not possible,
please let us know.
Thank you!
Jennifer Pan
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics