FWIW: depending on the threshold chosen in step 2 of Anonymization suggested by Yuvi, some of the countries/languages will have no data. This data will solve the problem for some of the partners, but not all of them.
On Monday, August 25, 2014, Jessie Wild <email@example.com> wrote:
THIS IS SO USEFUL!

For grantmaking, this is exactly the type of dataset we want to have publicly available. A lot of the initiatives we fund are at a country-based level, and our partners have a really hard time understanding the effects of the work they are doing on the aggregate language-wiki level. In addition to these edits per country, it would be even more important for us to get the total number of editors / active editors by country as well.

Kevin - it would be great to get an update from you on the timeline for this (in Q4 2014-15, it was punted to Q1 2014-15, but I haven't heard anything about it yet ...)

Thanks for starting this work, Yuvi!

Jessie

On Mon, Aug 25, 2014 at 9:43 AM, Yuvi Panda <firstname.lastname@example.org> wrote:
On Mon, Aug 25, 2014 at 5:41 PM, Kevin Leduc <email@example.com> wrote:
:D
> Hey Yuvi,
> this sounds like very interesting data to look at. Here are my thoughts:
Glad to hear that!
> - the Anonymization scheme sounds reasonable, and I'd like to hear from
> someone else @ wikimedia who has similar experience anonymizing data sets
Oh yeah, definitely. Will come once the code, etc is done :)
> - you were probably already thinking about it, but we need documentation
> too: a wikipage with the name of the table, data dictionary, etc... and even
> a blog post to announce the newly available data.
Yuvi Panda T
Analytics mailing list
--
Jessie Wild Sneller
Grantmaking Learning & Evaluation
Wikimedia Foundation