FWIW: depending on the threshold chosen in step 2 of the anonymization scheme suggested by Yuvi, some of the countries/languages will have no data. This data will solve the problem for some of the partners, but not all of them.
On Monday, August 25, 2014, Jessie Wild <jwild@wikimedia.org> wrote:
THIS IS SO USEFUL!

For grantmaking, this is the exact type of dataset we want to have publicly available. A lot of the initiatives we fund are at a country-based level, and our partners have a really hard time understanding the effects of the work they are doing on the aggregate language-wiki level. In addition to these edits per country, it would be even more important for us to get the total number of editors / active editors by country as well.

Kevin - it would be great to get an update from you on the timeline for this (in Q4 2014-15, it was punted to Q1 2014-15, but I haven't heard anything about it yet ...)

Thanks for starting this work, Yuvi!

Jessie

On Mon, Aug 25, 2014 at 9:43 AM, Yuvi Panda <yuvipanda@gmail.com> wrote:
On Mon, Aug 25, 2014 at 5:41 PM, Kevin Leduc <kevin@wikimedia.org> wrote:
:D
> Hey Yuvi,
>
> this sounds like very interesting data to look at. Here are my thoughts:
Glad to hear that!
> - the Anonymization scheme sounds reasonable, and I'd like to hear from
> someone else @ wikimedia who has similar experience anonymizing data sets
Oh yeah, definitely. Will come once the code, etc is done :)
> - you were probably already thinking about it, but we need documentation
> too: a wikipage with the name of the table, data dictionary, etc... and even
> a blog post to announce the newly available data.
--
Yuvi Panda T
http://yuvi.in/blog
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Jessie Wild Sneller
Grantmaking Learning & Evaluation
Wikimedia Foundation
Imagine a world in which every single human being can freely share in
the sum of all knowledge. Help us make it a reality!
Donate to Wikimedia