I think the biggest problem is this:

Let's say that we see the proportion of users who set their gender preference to female falling.  Is that because women are becoming less likely to set their gender preference, or because the actual gender ratio is becoming more skewed?

Let's say that we see a trend in the messy data.  What do we do about that?  Do we assume that it is a change in the actual ratio?  Do we assume that it is a change in the propensity of females to set their gender preference and there's nothing for us to do?  Or do we then decide that it is important for us to gather good data so that we can actually know what's going on?
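To make the ambiguity concrete, here is a minimal sketch in Python. It assumes the observed share of users with the preference set to female is simply the true female share multiplied by the rate at which women set the preference. That model and every number in it are made up for illustration; none of it comes from real data.

```python
import math

def observed_female_share(true_female_share, female_report_rate):
    """Share of all users whose gender preference is set to female.

    Assumed toy model: observed = true share * propensity to report.
    """
    return true_female_share * female_report_rate

# Scenario A (hypothetical): the true ratio is more skewed,
# but half of women set the preference.
scenario_a = observed_female_share(true_female_share=0.10, female_report_rate=0.50)

# Scenario B (hypothetical): the true ratio is less skewed,
# but only a quarter of women set the preference.
scenario_b = observed_female_share(true_female_share=0.20, female_report_rate=0.25)

# Both scenarios yield the same observed share, so the messy data
# alone cannot tell them apart.
indistinguishable = math.isclose(scenario_a, scenario_b)
```

Under this assumption, the same observed number is consistent with very different underlying ratios, which is why a trend in the self-reported data does not by itself say which factor moved.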


On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari <rkaldari@wikimedia.org> wrote:
On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila@wikimedia.org> wrote:
1. We look at the self-reported gender data and do some simple observations.
   + we will have an updated view of the gender gap problem.
   + we may spread seeds for further internal and/or external research about it.
   - If simple observations are not communicated properly, they will result in misinformation that could do more harm than good.
   - The results will be very limited given that we know the data is very limited and contains biases.

I would definitely like to avoid spreading misinformation, which is why I proposed only looking at the percentage change per month rather than raw numbers or raw percentages. The raw numbers are almost certainly off-base and would be much more likely to be latched onto by the public and the media. Percentage change per month is a less 'sexy' statistic, but might give us better clues about what's actually going on with the gender gap over time. It would also, for the first time, give us some window into how new features or issues may be actively affecting the gender gap. But again, it would only be a canary in a coal mine, not a tool to draw reliable conclusions from. For that, we need more extensive tools and analysis.
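As a rough sketch of the statistic proposed above, percentage change per month could be computed along these lines. The function name and the input values are hypothetical placeholders, not figures from the self-reported gender data, and the input is assumed to be one measurement per month in chronological order.

```python
import math

def monthly_pct_change(monthly_values):
    """Return the percentage change between each pair of consecutive months.

    Assumes one value per month, in chronological order, with no zero
    values (a zero in the previous month would make the change undefined).
    """
    changes = []
    for previous, current in zip(monthly_values, monthly_values[1:]):
        changes.append((current - previous) / previous * 100.0)
    return changes

# Hypothetical placeholder series, for illustration only.
example_series = [100, 110, 99]
example_changes = monthly_pct_change(example_series)
```

The output has one fewer entry than the input, since the first month has nothing to compare against. It only describes movement in the self-reported numbers, not its cause.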

2. We do extensive gender gap analysis internally.
Proper gender gap analysis, done in a way that can result in meaningful interventions (think products and features by us or the community), requires one person from R&D to work on it almost full time for a long period (at least six months, more probably a year). In this case, the question becomes: How should we prioritize this work? Just to give you some context: Which of the following areas should this one person from R&D work on?
   * reducing gender gap
   * increasing editor diversity in terms of nationality/language/...
   * increasing the number of active editors independent of gender
   * identifying areas where Wikipedia's coverage is weakest and finding editors who can contribute to those areas
   * ...

I think it's very difficult to judge how to set those priorities without having more data. We know that the number of active editors is on a downward trajectory. Is the nationality/language diversity increasing or decreasing? Is the gender gap increasing or decreasing? In cases where things are actively getting worse, we should set our priorities to address them sooner, but without knowing those trajectories it's impossible to say.


Analytics mailing list