Yay to more people finding it useful :)
Editors / Active editors aren't too hard to add programmatically. The bigger problem is how to define 'editor from country': is one edit from that country enough? Does that mean one editor can be considered to be from multiple countries? Do we count mobile and desktop separately, and so double count editors who use both?
An easy way to do this would be:
1. An 'editor from a country' is someone who has made at least one edit from that country
2. A 'desktop editor from a country' is someone who has made at least one edit from that country on desktop
3. A 'mobile editor from a country' is someone who has made at least one edit from that country on mobile
This muddles the data somewhat, since an editor who edits from two countries is counted in both, so sum(editors_from_all_countries_for_a_project) != total_editors_for_project. Likewise, an editor who uses both mobile and desktop is counted in both, so sum(mobile_editors, desktop_editors) per country != total_editors per country. However, this is super simple to implement and still useful, so I might end up doing that.
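To make the double counting concrete, here is a rough sketch in Python. The edit records, the names and the (editor, country, platform) layout are all made up for illustration; this is not the actual schema or the code I'd end up running.

```python
from collections import defaultdict

# Hypothetical edit records: (editor, country, platform).
# Purely illustrative; not real data or the real table layout.
EDITS = [
    ("alice", "IN", "desktop"),
    ("alice", "IN", "mobile"),
    ("alice", "US", "desktop"),
    ("bob", "IN", "mobile"),
    ("carol", "US", "desktop"),
]


def editors_by_country(edits, platform=None):
    """Map country -> set of editors with at least one edit from it.

    If platform is given, only edits made on that platform count.
    """
    result = defaultdict(set)
    for editor, country, edit_platform in edits:
        if platform is None or edit_platform == platform:
            result[country].add(editor)
    return dict(result)


all_editors = editors_by_country(EDITS)
desktop_editors = editors_by_country(EDITS, "desktop")
mobile_editors = editors_by_country(EDITS, "mobile")

# Distinct editors on the project vs. the sum of per-country counts.
total_editors = len({editor for editor, _, _ in EDITS})
sum_over_countries = sum(len(s) for s in all_editors.values())

# Per-country total vs. mobile + desktop for one country.
in_total = len(all_editors["IN"])
in_mobile_plus_desktop = len(mobile_editors["IN"]) + len(desktop_editors["IN"])
```

Here alice edits from two countries and on both platforms, so sum_over_countries is larger than total_editors, and in_mobile_plus_desktop is larger than in_total. Using sets keeps each per-country and per-platform number honest on its own; only the sums across them overcount.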
Of course, assuming this entire thing gets OK'd fully by analytics :)
On Mon, Aug 25, 2014 at 6:14 PM, Jessie Wild jwild@wikimedia.org wrote:
THIS IS SO USEFUL!
For grantmaking, this is the exact type of dataset we want to have publicly available. A lot of the initiatives we fund are at a country-based level, and our partners have a really hard time understanding the effects of the work they are doing on the aggregate language-wiki level. In addition to these edits per country, it would be even more important for us to get the total number of editors / active editors by country as well. Kevin - it would be great to get an update from you on the timeline for this (in Q4 2014-15, it was punted to Q1 2014-15, but I haven't heard anything about it yet ...)
Thanks for starting this work, Yuvi! Jessie
On Mon, Aug 25, 2014 at 9:43 AM, Yuvi Panda yuvipanda@gmail.com wrote:
On Mon, Aug 25, 2014 at 5:41 PM, Kevin Leduc kevin@wikimedia.org wrote:
Hey Yuvi,
this sounds like very interesting data to look at. Here are my thoughts:
:D
- the Anonymization scheme sounds reasonable, and I'd like to hear from
someone else @ wikimedia who has similar experience anonymizing data sets
Glad to hear that!
- you were probably already thinking about it, but we need documentation
too: a wikipage with the name of the table, data dictionary, etc... and even a blog post to announce the newly available data.
Oh yeah, definitely. Will come once the code, etc is done :)
-- Yuvi Panda T http://yuvi.in/blog
-- Jessie Wild Sneller Grantmaking Learning & Evaluation Wikimedia Foundation