Dear members of the Analytics Team,
I am currently conducting research on the excludability of free knowledge available on the Wikimedia projects as an example of a public good. To calibrate the model, I need aggregate data on page views and edits by country and language.
After carefully reading Research:Data (https://meta.wikimedia.org/wiki/Research:Data), I was only able to find data on page views by country and language, which would be enough to calibrate the demand side of my model. Is it possible to get aggregate data on edits by country and language, similar to the page view data available at WikiStats?
Thanks in advance.
Best regards, Kiril Simeonovski
Hi Kiril,
We have editors by country here: https://dumps.wikimedia.org/other/geoeditors/readme.html and visualized here: https://stats.wikimedia.org/#/en.wikipedia.org/contributing/active-editors-b... We also have edits by country and language, but we don't publish that data except to the Global Innovation Index folks, who publish a yearly report. The internal table is described here: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Geoeditors#Geo...
Can you take a look at the editors by country and let me know if the monthly edits table and data would be more useful? What we could publish externally would be more limited, without some of the smaller projects (I forget the exact cut-offs and sanitization that are applied).
On Wed, May 17, 2023 at 5:00 PM Kiril Simeonovski <kiril.simeonovski@gmail.com> wrote:
[...]
_______________________________________________
Analytics mailing list -- analytics@lists.wikimedia.org
To unsubscribe send an email to analytics-leave@lists.wikimedia.org
Hi Dan,
Thanks for your response.
I have carefully checked all the links that you shared with me, and it looks like the monthly edits are what should work best to calibrate the supply side of my model. With regard to the granularity of data per country and language, it is absolutely fine if only the projects and languages with the largest shares are listed, while all others are grouped into an "Other" category. Here are some good examples that display outdated aggregate data on edits:
* https://stats.wikimedia.org/wikimedia/squids/SquidReportPageEditsPerLanguage...
* https://stats.wikimedia.org/wikimedia/squids/SquidReportPageEditsPerCountryB...
* https://meta.wikimedia.org/wiki/Edits_by_project_and_country_of_origin
In fact, my ultimate research goal is to calculate how much free knowledge on the Wikimedia projects is lost across countries in the Global South and Global North because of socio-economic and institutional factors that leave people with no access to the Wikimedia projects (page views per country/language are a perfect proxy on the demand side; in the same way, edits per country/language should be a perfect metric on the supply side).
Thanks.
Best regards, Kiril
On Thu, May 18, 2023 at 5:21 PM Dan Andreescu dandreescu@wikimedia.org wrote:
[...]
Dear Dan,
I would like to follow up on my previous email regarding the data on the number of edits per country.
The editors-per-country data that you pointed me to at https://dumps.wikimedia.org/other/geoeditors/readme.html contain comprehensive monthly statistics on the frequency distribution of editors by edit-count interval. These values seem to have been derived from raw data on the number of edits made by individual editors across countries. I would like to know if it is possible to get aggregate figures on edits by country.
Thanks in advance.
Best regards, Kiril
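The gap described in this message can be illustrated with a small sketch. When editors are reported only as counts per edit interval, total edits per country can be bounded but not recovered exactly, which is why aggregate edit figures would be needed. The interval boundaries, country labels, and counts below are made-up illustrative values, not the actual contents or format of the geoeditors dump, and `edit_bounds` is a hypothetical helper.

```python
# Hypothetical rows, NOT the real geoeditors format:
# (country, lowest edits in interval, highest edits in interval or None if open-ended, editors)
ROWS = [
    ("A", 1, 4, 100),
    ("A", 5, 99, 20),
    ("A", 100, None, 3),   # open-ended interval: "100 or more" edits
    ("B", 1, 4, 50),
    ("B", 5, 99, 10),
]


def edit_bounds(rows):
    """Return {country: (min_edits, max_edits)}.

    max_edits is None when any interval for that country is open-ended,
    because no finite upper bound on total edits can then be derived.
    """
    bounds = {}
    for country, low, high, editors in rows:
        lo, hi = bounds.get(country, (0, 0))
        # Every editor in the interval made at least `low` edits.
        lo += low * editors
        # Once the upper bound is unknown it stays unknown.
        if hi is not None:
            hi = None if high is None else hi + high * editors
        bounds[country] = (lo, hi)
    return bounds


if __name__ == "__main__":
    for country, (lo, hi) in edit_bounds(ROWS).items():
        upper = "unbounded" if hi is None else hi
        print(f"{country}: at least {lo} edits, at most {upper}")
```

Under these made-up numbers the possible range for each country is wide, so interval counts alone are a weak substitute for aggregate edit totals on the supply side.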
On Fri, 26 May 2023 at 20:30, Kiril Simeonovski kiril.simeonovski@gmail.com wrote:
[...]