Hi!
I'm a Ph.D. student in economics and use some of the Wikimedia data in my research. Is it possible to get data on Wikipedia pageviews by country and article category? Currently the Wikimedia Foundation provides aggregate data on pageviews by country and less aggregated data on pageviews by article, but it looks like there is no way to find out, for example, the pageviews of math articles in India.
More specifically, my questions are:
1) Is it possible in some way to extract information on pageviews by country and subject area from your publicly available data? The amount of data currently available is already vast, and I could have missed it.
2) If it is not possible, how can I persuade you to make this data available? I would argue that the data can be made available without compromising confidentiality, either by using only the first numbers of the IP address or by publishing only the country of the user, as well as by aggregating by category.
I'm looking forward to hearing from you. I'm sure that many social scientists would also be glad of the opportunity to produce more interesting and policy-relevant research.
Best regards,
Alexander Ugarov, Ph.D. Candidate
Sam M. Walton College of Business
Department of Economics
University of Arkansas
Alexander, thanks for writing. It's possible to get data by category and country, though it is quite hard and limited to internal use at the moment. We are working to make it both easier to get and available for publishing to the world, but there is a lot of work to be done. We're an open source project, so of course you can contribute to that work; I can send links to you and anyone else interested. You can also apply for a research project here:
https://meta.wikimedia.org/wiki/Research:New_project
If you apply for a research project, you'll have to sign an NDA to get access to this data, and meet all the requirements of the research team.
Either way, I hope that within a year or so, the kind of question you're asking will be possible to answer with public data.
On Wed, Nov 2, 2016 at 3:22 PM, Alexander Ugarov AUgarov@walton.uark.edu wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Hi Alexander,
If you have relevant software development skills, and the interest and time to help out, then WMF Project Grants might be able to provide you with some financial support for work in this area. Have a look at https://meta.wikimedia.org/wiki/Grants:Project
Pine
On Wed, Nov 16, 2016 at 7:24 AM, Dan Andreescu dandreescu@wikimedia.org wrote: