I first pitched this idea to Aaron Halfaker in July, but nothing has happened so far, so I wanted to pitch it to the whole analytics team.

The Foundation has been discussing the gender gap and how to address it since I started four years ago. Often there is discussion of how particular features or projects might theoretically impact the gender gap: the Education Program, Visual Editor, WikiLove, editathons, etc. Unfortunately, we have absolutely no idea whether any of these things have any impact. Nor do we have any idea whether the gender gap is getting better, getting worse, or staying the same. All we have is a handful of non-comparable data points based on surveys with different methodologies.

The main obstacle to generating useful gender gap data has always been that we don't have reliable absolute numbers, because editors do not reliably indicate their gender in the preferences. There is nothing stopping us, however, from analysing relative trends using existing data. For example, we could generate graphs showing, per month, the relative share of edits by men and by women. This data would be unaffected by the unreliability of the absolute numbers, since we would only be looking at changes in the percentages.
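
To make the idea concrete, here is a rough sketch in Python of the kind of calculation I have in mind. It is only an illustration: the input format (month, declared gender, edit count), the function names, and the sample numbers are all made up for this example, and the real version would need to pull these counts from the database first.

```python
from collections import defaultdict


def monthly_share(rows):
    """Share of edits by self-declared women, per month.

    rows: iterable of (month, gender, edits) tuples. This layout is an
    assumption for illustration, not an existing table or API.
    Only edits by users who declared 'female' or 'male' are counted.
    """
    totals = defaultdict(lambda: {"female": 0, "male": 0})
    for month, gender, edits in rows:
        if gender in ("female", "male"):
            totals[month][gender] += edits

    shares = {}
    for month in sorted(totals):
        declared = totals[month]["female"] + totals[month]["male"]
        if declared:  # skip months with no declared-gender edits
            shares[month] = totals[month]["female"] / declared
    return shares


def monthly_change(shares):
    """Month-over-month change in the share (in fractional points)."""
    months = sorted(shares)
    return {cur: shares[cur] - shares[prev]
            for prev, cur in zip(months, months[1:])}


# Invented sample data, purely to show the shape of the output.
sample = [
    ("2013-01", "female", 10), ("2013-01", "male", 90),
    ("2013-02", "female", 15), ("2013-02", "male", 85),
    ("2013-03", "female", 12), ("2013-03", "male", 108),
    ("2013-03", "unknown", 500),  # ignored: no declared gender
]

shares = monthly_share(sample)
changes = monthly_change(shares)
for month in sorted(shares):
    print(month, round(shares[month], 3), round(changes.get(month, 0.0), 3))
```

The point of working with shares rather than raw counts is that the large pool of edits from users with no declared gender drops out of the calculation, so only the trend among declared users is plotted.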

This is possible right now with existing data and shouldn't be very hard to generate (although the queries will be expensive). To see a full explanation of the idea, please check out the Trello card and add comments there:

Ryan Kaldari