Re: [Analytics] pitching the Gender Edit Dashboard

28 Aug 2014

And because I know someone is going to point this out... Actually, restricting the data to
only editors who have explicitly set their gender would not completely control for changes
in the rate of setting the preference since that rate could change differently for men and
women. It would at least help to control for overall changes in the rate, for example, due
to the change in the interface that Steven mentioned.
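To make that caveat concrete, here is a rough sketch in Python. The function name and every number in it are hypothetical, chosen only for illustration; none of it comes from real editor data. It assumes the share we observe among editors who set the preference depends on the true counts and on a separate setting rate for each gender.

```python
def observed_female_share(n_female, n_male, p_set_female, p_set_male):
    """Share of women among editors who explicitly set a gender preference.

    n_female / n_male: true (unobservable) editor counts, hypothetical here.
    p_set_female / p_set_male: fraction of each group that sets the preference.
    """
    set_f = n_female * p_set_female
    set_m = n_male * p_set_male
    return set_f / (set_f + set_m)


# Hypothetical population whose true ratio never changes.
baseline = observed_female_share(200, 800, 0.10, 0.10)

# Both rates double (e.g. an interface change): the common factor cancels.
uniform_shift = observed_female_share(200, 800, 0.20, 0.20)

# Only the rate for women drops: the observed share moves even though
# the underlying population is the same.
differential_shift = observed_female_share(200, 800, 0.05, 0.10)

print(baseline, uniform_shift, differential_shift)
```

A uniform change in the setting rate leaves the observed share alone, while a change that differs by gender shifts it, which is the residual bias described above.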

Kaldari

On Aug 28, 2014, at 9:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...
  We could restrict the query to only look at editors
who had explicitly set their gender preference. That would control for changes in the rate
of setting the preference. The data would then only be biased by users who had explicitly
set their gender to the incorrect gender, which I imagine would be a very small
percentage.

 Also, I would like to point out that even our most fundamental metrics are affected by
similar biases and inconsistencies. For example, the rate of new editors is polluted by
long-time IP editors who suddenly decide to create an account. If there is an increase in
IP editors converting to registered editors, it can mislead us into thinking that we are
suddenly attracting a lot of new editors. This is just one of many examples I'm sure
you're already familiar with.

 To answer your question though, I think if we notice something interesting in the data
(especially a downward trend), we would start a discussion about it (as we would with any
interesting data) and hopefully inspire someone to dig deeper. Right now though we are
mostly in the dark. See, for example, Phoebe's most recent email to the gendergap list
lamenting the lack of research and data.

 Kaldari

 On Thu, Aug 28, 2014 at 1:43 AM, Aaron Halfaker <ahalfaker(a)wikimedia.org> wrote:
  I think the biggest problem is this:

 Let's say that we see the proportion of users who set their gender preference to
female falling.  Is that because women are becoming less likely to set their gender
preference or because the ratio is actually becoming more extreme?

 Let's say that we see a trend in the messy data.  What do we do about that?  Do we
assume that it is a change in the actual ratio?  Do we assume that it is a change in the
propensity of females to set their gender preference and there's nothing for us to do?
 Or do we then decide that it is important for us to gather good data so that we can
actually know what's going on?

 -Aaron

 On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:
  On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org> wrote:
  1. We look at the self-reported gender data and do some simple observations.
 Pros:
    + we will have an updated view of the gender gap problem.
    + we may spread seeds for further internal and/or external research about it.
 Cons:
    - If simple observations are not communicated properly, they will result in
misinformation that could do more harm than good.
    - The results will be very limited given that we know the data is very limited and
contains biases.  
 I would definitely like to avoid spreading misinformation, which is why I proposed only
looking at the percentage change per month rather than raw numbers or raw percentages. The
raw numbers are almost certainly off-base and would be much more likely to be latched onto
by the public and the media. Percentage change per month is a less 'sexy'
statistic, but might give us better clues about what's actually going on with the
gender gap over time. It would also, for the first time, give us some window into how new
features or issues may be actively affecting the gender gap. But again, it would only be a
canary in a coal mine, not a tool to draw reliable conclusions from. For that, we need
more extensive tools and analysis.
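As a minimal sketch of the statistic I have in mind (the function names and monthly counts below are made up for illustration, not pulled from any real query), the calculation could look something like this in Python:

```python
def female_share(female_count, male_count):
    """Share of women among editors who explicitly set a gender preference."""
    total = female_count + male_count
    return female_count / total if total else None


def monthly_pct_change(shares):
    """Relative month-over-month change, in percent, for a list of monthly shares.

    Returns one value per consecutive pair of months; None where the
    previous month's share is zero and the change is undefined.
    """
    changes = []
    for prev, curr in zip(shares, shares[1:]):
        if prev == 0:
            changes.append(None)
        else:
            changes.append((curr - prev) / prev * 100.0)
    return changes


# Hypothetical (female, male) counts of active editors with an explicit
# gender preference, one tuple per month.
monthly_counts = [(1000, 9000), (1100, 8900), (990, 9010)]
shares = [female_share(f, m) for f, m in monthly_counts]

for change in monthly_pct_change(shares):
    print(f"{change:+.1f}%")
```

The output is only the direction and size of the month-to-month movement, so the raw percentages never have to be published.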

  2. We do extensive gender gap analysis internally.
 Proper gender gap analysis, in a way that can result in meaningful interventions (think
products and features by us or the community), requires one person from R&D to work on
it almost full time for a long period of time (at least six months, more probably a year).
In this case, the question becomes: How should we prioritize this question? Just to give
you some context: Which of the following areas should this one person from R&D work
on?
    * reducing gender gap
    * increasing editor diversity in terms of nationality/language/...
    * increasing the number of active editors independent of gender
    * identifying the areas Wikipedia covers least and finding editors who can
contribute to those areas
    * ...  
 I think it's very difficult to judge how to set those priorities without having more
data. We know that the active editors number is on a downward trajectory. Is the
nationality/language diversity increasing or decreasing? Is the gender gap increasing or
decreasing? In cases where things are actively getting worse, we should set our priorities
to address them sooner, but without knowing those trajectories it's impossible to
say.

 Kaldari

 _______________________________________________
 Analytics mailing list
 Analytics(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/analytics  

