Alexander,
Please ask questions such as this one on our public list (cc-ed).
We do not have public data available when it comes to pages per country
other than what is available here:
https://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryBreakdown.htm
You can however set up a collaboration with our research team to have
access to private data, but please be aware that our research team can
handle only so many collaborations at any one time.
> This data can be made available without losing confidentiality by using
> either only first IP-address numbers or by publishing only the country of
> users, as well as aggregating by the category.
No, not really. Countries like San Marino or Andorra are not much bigger
than cities, so anonymizing by country might leak a lot of information,
especially if you include the titles of the pages viewed.
Thanks,
Nuria
On Wed, Dec 7, 2016 at 7:04 AM, Alexander Ugarov <augarov(a)email.uark.edu>
wrote:
Dear Ms. Ruiz,
I am conducting a research project on the international determinants of
education quality. In my view, Wikimedia statistics are a priceless
source of information on how much learning people do in private. The
statistics you have made available have been very valuable to me so far. I
would greatly appreciate your help in getting access to data that is
not readily available on the Wikimedia Foundation website.
I would like to access data on Wikipedia pageviews by country,
language and content area to measure private learning in different
countries. My previous empirical results suggest that Wikipedia pageviews
are highly correlated with education quality. Unfortunately, the available
data do not allow educational pageviews to be separated from purely
entertainment-driven ones (for example, celebrity biographies).
I would appreciate it if you could answer two specific questions:
1) Is it potentially possible to extract the information on pageviews
by country and subject from your publicly available data? I can program and
extract the information as soon as it is there.
2) If this information cannot be extracted from the publicly available
dataset, is it possible to make it available to me or to researchers
in general? This data can be made available without losing confidentiality
by using either only first IP-address numbers or by publishing only the
country of users, as well as aggregating by the category.
I am looking forward to hearing from you on the availability of this data. I
am sure that many social scientists will also benefit from using such
information (if you make it available) and will produce some
policy-relevant research.
Best regards,
Alexander Ugarov,
Sam M. Walton College of Business
Department of Economics
University of Arkansas.