This was a survey of new account holders (not necessarily editors). The
results were 67% male, 22% female, 11% prefer not to say. I think the
survey was useful in that it let us know that the gender gap exists as
early as the account sign-up funnel.
Kaldari
On Thu, Aug 28, 2014 at 1:25 PM, Andrew Gray <andrew.gray(a)dunelm.org.uk>
wrote:
I believe we did a one-question gender microsurvey before (linked to
one of the new-user features?). I don't know whether the data was
useful or not, but I do remember the act of asking the question itself
got some pushback as being invasive/unwelcoming/weirdly
communicated/etc. (and I can certainly sympathise with this).
So as well as the value of the data, we should consider whether the
act/method of asking is going to have knock-on effects on what we're
trying to measure.
Andrew.
On 28 August 2014 20:55, Jonathan Morgan <jmorgan(a)wikimedia.org> wrote:
Stepping back...
We all seem to agree that user-set gender preference is a problematic
measure. We don't trust it. We can come up with plausible hypotheses for
why someone would mis-report their gender. And we can be almost certain
it's not a representative sample.
Do we have any ideas for what a better measure would be? Seems to me that
we're dealing with self-report data no matter what. But perhaps a more
explicit elicitation would be better? Folks have suggested a one-question
gender microsurvey before. Of course that will come with its own sources
of bias, and I don't quite see how we can control for them.
Given that it would be useful to have some data on gendered editing
patterns (whether we share it publicly or not), what are our options?
- Jonathan
On Thu, Aug 28, 2014 at 10:03 AM, Ryan Kaldari <rkaldari(a)wikimedia.org>
wrote:
>
> And because I know someone is going to point this out... Actually,
> restricting the data to only editors who have explicitly set their gender
> would not completely control for changes in the rate of setting the
> preference since that rate could change differently for men and women. It
> would at least help to control for overall changes in the rate, for
> example, due to the change in the interface that Steven mentioned.
>
> Kaldari
>
> On Aug 28, 2014, at 9:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org>
> wrote:
>
> We could restrict the query to only look at editors who had explicitly set
> their gender preference. That would control for changes in the rate of
> setting the preference. The data would then only be biased by users who had
> explicitly set their gender to the incorrect gender, which I imagine would
> be a very small percentage.
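A minimal sketch of that restriction, assuming each user record carries a hypothetical `gender` field whose value is `"unknown"` when the preference was never set. The field name, values, and sample records are illustrative assumptions, not the actual schema or data:

```python
from collections import Counter

def explicit_gender_shares(users):
    """Return each gender's share among users who explicitly set one.

    Users whose (hypothetical) `gender` field is anything other than
    "female" or "male" are treated as not having set the preference.
    """
    explicit = [u["gender"] for u in users if u["gender"] in ("female", "male")]
    if not explicit:
        return {}
    counts = Counter(explicit)
    return {gender: n / len(explicit) for gender, n in counts.items()}

# Made-up records, for illustration only.
users = [
    {"name": "A", "gender": "female"},
    {"name": "B", "gender": "male"},
    {"name": "C", "gender": "unknown"},
    {"name": "D", "gender": "male"},
    {"name": "E", "gender": "male"},
]
shares = explicit_gender_shares(users)
print(shares)
```

The denominator is the number of users who set a preference, not all users, so an overall rise or fall in how many people set it does not move the shares by itself.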
>
> Also, I would like to point out that even our most fundamental metrics are
> affected by similar biases and inconsistencies. For example, the rate of new
> editors is polluted by long-time IP editors who suddenly decide to create an
> account. If there is an increase in IP editors converting to registered
> editors, it can mislead us into thinking that we are suddenly attracting a
> lot of new editors. This is just one of many examples I'm sure you're
> already familiar with.
>
> To answer your question though, I think if we notice something interesting
> in the data (especially a downward trend), we would start a discussion about
> it (as we would with any interesting data) and hopefully inspire someone to
> dig deeper. Right now though we are mostly in the dark. See, for example,
> Phoebe's most recent email to the gendergap list lamenting the lack of
> research and data.
>
> Kaldari
>
>
> On Thu, Aug 28, 2014 at 1:43 AM, Aaron Halfaker
> <ahalfaker(a)wikimedia.org> wrote:
>>
>> I think the biggest problem is this:
>>
>> Let's say that we see the proportion of users who set their gender
>> preference to female falling. Is that because women are becoming less
>> likely to set their gender preference or because the ratio is actually
>> becoming more extreme?
>>
>> Let's say that we see a trend in the messy data. What do we do about
>> that? Do we assume that it is a change in the actual ratio? Do we assume
>> that it is a change in the propensity of females to set their gender
>> preference and there's nothing for us to do? Or do we then decide that it
>> is important for us to gather good data so that we can actually know
>> what's going on?
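To make the ambiguity above concrete, here is a small sketch under stated assumptions: a simple two-group model in which the observed share depends on both the true ratio and each group's probability of setting the preference. The function name and every number are hypothetical and chosen only for illustration:

```python
import math

def observed_female_share(true_female_share, p_set_female, p_set_male):
    """Share of 'female' among users who set a preference.

    true_female_share: actual fraction of editors who are women (unobserved).
    p_set_female, p_set_male: probability that a woman / a man sets the
    preference (also unobserved).
    """
    f = true_female_share * p_set_female
    m = (1 - true_female_share) * p_set_male
    return f / (f + m)

# Baseline: 25% women, both groups equally likely to set the preference.
baseline = observed_female_share(0.25, 0.5, 0.5)

# Scenario A: the true ratio becomes more extreme, propensities unchanged.
ratio_shift = observed_female_share(0.20, 0.5, 0.5)

# Scenario B: the true ratio is unchanged, but women become less likely
# to set the preference.
propensity_shift = observed_female_share(0.25, 0.375, 0.5)

print(baseline, ratio_shift, propensity_shift)
print(math.isclose(ratio_shift, propensity_shift))
```

Under these made-up numbers both scenarios produce the same observed share (about 0.20, down from 0.25), so the preference data alone cannot tell them apart.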
>
> -Aaron
>
>
> On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org>
> wrote:
>>>
>>> On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org>
>>> wrote:
>>>>
>>>> 1. We look at the self-reported gender data and do some simple
>>>> observations.
>>>> Pros:
>>>> + we will have an updated view of the gender gap problem.
>>>> + we may spread seeds for further internal and/or external research
>>>> about it.
>>>> Cons:
>>>> - If simple observations are not communicated properly, they will
>>>> result in misinformation that can possibly do more harm than good.
>>>> - The results will be very limited given that we know the data is
>>>> very limited and contains biases.
>>>
>>>
>>> I would definitely like to avoid spreading misinformation, which is why
>>> I proposed only looking at the percentage change per month rather than raw
>>> numbers or raw percentages. The raw numbers are almost certainly off-base
>>> and would be much more likely to be latched onto by the public and the
>>> media. Percentage change per month is a less 'sexy' statistic, but might
>>> give us better clues about what's actually going on with the gender gap over
>>> time. It would also, for the first time, give us some window into how new
>>> features or issues may be actively affecting the gender gap. But again, it
>>> would only be a canary in a coal mine, not a tool to draw reliable
>>> conclusions from. For that, we need more extensive tools and analysis.
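A rough sketch of the month-over-month statistic described above, assuming "percentage change per month" means the relative change between consecutive monthly values (it could also be read as a percentage-point difference). The helper name and the monthly figures are hypothetical:

```python
def monthly_percent_change(values):
    """Relative change, in percent, between consecutive monthly values.

    Assumes every value that serves as a baseline is non-zero.
    """
    return [
        (curr - prev) / prev * 100
        for prev, curr in zip(values, values[1:])
    ]

# Made-up monthly female share among editors who set a preference.
female_share_by_month = [0.20, 0.21, 0.21, 0.189]
changes = monthly_percent_change(female_share_by_month)
print([round(c, 1) for c in changes])
```

The output has one entry fewer than the input because the first month has no earlier month to compare against. Reporting only these changes keeps the raw shares, which are known to be unreliable, out of the headline figure.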
>>>
>>>> 2. We do extensive gender gap analysis internally.
>>>> Proper gender gap analysis, in a way that can result in meaningful
>>>> interventions (think products and features by us or the community) requires
>>>> one person from R&D to work on it almost full time for a long period of time
>>>> (at least six months, more probably a year). In this case, the question
>>>> becomes: How should we prioritize this question? Just to give you some
>>>> context: Which of the following areas should this one person from R&D work
>>>> on?
>>>> * reducing gender gap
>>>> * increasing editor diversity in terms of nationality/language/...
>>>> * increasing the number of active editors independent of gender
>>>> * identifying the areas where Wikipedia's coverage is weakest and
>>>> finding editors who can contribute to those areas
>>>> * ...
>>>
>>>
>>> I think it's very difficult to judge how to set those priorities without
>>> having more data. We know that the active editors number is on a downward
>>> trajectory. Is the nationality/language diversity increasing or decreasing?
>>> Is the gender gap increasing or decreasing? In cases where things are
>>> actively getting worse, we should set our priorities to address them sooner,
>>> but without knowing those trajectories it's impossible to say.
>>
>> Kaldari
>>
>> _______________________________________________
>> Analytics mailing list
>> Analytics(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
--
Jonathan T. Morgan
Learning Strategist
Wikimedia Foundation
User:Jmorgan (WMF)
jmorgan(a)wikimedia.org
--
- Andrew Gray
andrew.gray(a)dunelm.org.uk