FWIW: depending on the threshold chosen in step 2 of the anonymization
scheme Yuvi suggested, some of the countries/languages will have no data.
The data will solve the problem for some of the partners, but not all of them.
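
To illustrate the concern, here is a minimal sketch. It assumes step 2 is a
minimum-count cutoff on each (country, language) bucket; the thread does not
spell the step out, so the cutoff rule, the function name, and the sample
counts below are all hypothetical.

```python
from collections import Counter

# Hypothetical sample: one (country, language) pair per edit. Not real data.
edits = (
    [("IN", "hi")] * 4
    + [("US", "en")] * 3
    + [("KE", "sw")] * 2
    + [("NP", "ne")] * 1
)


def suppress_small_buckets(counts, threshold):
    """Keep only buckets whose edit count is at least `threshold`.

    This is an assumed form of the anonymization step, not the actual scheme.
    """
    return {bucket: n for bucket, n in counts.items() if n >= threshold}


counts = Counter(edits)
released = suppress_small_buckets(counts, threshold=3)

# Buckets that fall below the cutoff vanish from the published dataset.
dropped = sorted(set(counts) - set(released))
```

With a cutoff of 3, the two smallest buckets end up in `dropped` and would
show no data at all; a lower cutoff keeps more of them, at the cost of weaker
anonymity.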
On Monday, August 25, 2014, Jessie Wild <jwild(a)wikimedia.org> wrote:
THIS IS SO USEFUL!
For grantmaking, this is the exact type of dataset we want to have
publicly available. A lot of the initiatives we fund are at a country-based
level, and our partners have a really hard time understanding the effects
of the work they are doing on the aggregate language-wiki level. In
addition to these edits per country, it would be even more important for us
to get the total number of editors / active editors by country as well.
Kevin - it would be great to get an update from you on the timeline for this
(in Q4 2014-15, it was punted to Q1 2014-15, but I haven't heard anything
about it yet ...)
Thanks for starting this work, Yuvi!
On Mon, Aug 25, 2014 at 9:43 AM, Yuvi Panda <yuvipanda(a)gmail.com> wrote:
On Mon, Aug 25, 2014 at 5:41 PM, Kevin Leduc wrote:
This sounds like very interesting data to look at. Here are my thoughts:
- the anonymization scheme sounds reasonable, and I'd like to hear from
  someone else @ wikimedia who has similar experience anonymizing data
Glad to hear that!
- you were probably already thinking about it, but we need documentation
  too: a wikipage with the name of the table, data dictionary, etc... and
  a blog post to announce the newly available data.
Oh yeah, definitely. Will come once the code, etc is done :)
Yuvi Panda
*Jessie Wild Sneller*
*Grantmaking Learning & Evaluation*
Imagine a world in which every single human being can freely share in
the sum of all knowledge. Help us make it a reality!
Donate to Wikimedia <https://donate.wikimedia.org/>