Sage Ross, I think you've missed my point. My point was that the
number of editors identifying as female is an entirely different piece
of data than the number of females editing Wikipedia, and one should
not be used as a surrogate for the other. That is as true for this
most recent data as for the data I was cautioning about earlier; it's
based on self-identification and should not be taken as an estimate of
women editing Wikipedia.
I disagree strongly with the statement "At first glance, it would seem
that the gender gap is larger among very active editors." Maybe at a
layman's first glance, that's the case, but a statistician glancing at
these numbers doesn't see that at all. What I see is the conflation
of two different kinds of data. You cannot conclude, even
tentatively, from these data whether the numbers relating to editors
who self-identify by gender have anything to do with female
participation among Wikipedia editors as a whole. As I said before,
it's entirely possible, even probable, that editors who take the
trouble to self-identify by gender are different in other important
ways from those who don't, so it could be very misleading to
generalize from one population to the other.
Also, the suggestion, even with a caveat, that at first glance these
data seem to show that "the gender gap is larger among very active
editors" is not valid and does not accurately reflect the data. As far
as the data can tell us, the explanation that women who know Wikipedia
well are less likely to self-identify by gender is as likely as the
explanation that women are less likely to become active editors. Which
of these explanations better reflects reality simply can't be
determined from these data.
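To illustrate why the data can't separate these two explanations, here
is a small sketch in Python. All counts and disclosure rates in it are
invented for the example, not taken from the table; the only point is
that two very different populations can produce identical
self-identification counts.

```python
# Hypothetical illustration: every number here is invented, not from the table.

def observed_counts(women, men, female_rate, male_rate):
    """Return (declared female, declared male, undeclared) for a group of editors."""
    declared_f = round(women * female_rate)
    declared_m = round(men * male_rate)
    undeclared = women + men - declared_f - declared_m
    return declared_f, declared_m, undeclared

# Scenario A: few women among active editors, everyone discloses at the same rate.
scenario_a = observed_counts(women=100, men=900, female_rate=0.40, male_rate=0.40)

# Scenario B: many women among active editors, but they rarely disclose.
scenario_b = observed_counts(women=400, men=600, female_rate=0.10, male_rate=0.60)

print(scenario_a)
print(scenario_b)
```

Both scenarios give 40 declared women, 360 declared men and 600
undeclared editors, although women are 10% of the editors in one and
40% in the other.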
By the way, some of the percentages are wrong. The male "percentage
of total" column is right for part of the column and then veers off;
it appears that from some point on, the percentage was determined by
dividing the number of self-identified males within an edit-count
category by the number of non-self-identifying editors, rather than
by the total number of editors in that category. For example,
the number in the 65535 row identified as 66% (of editors in that
edit category identifying as male) should actually be 39%; the number
in the 32767 row identified as 52% should actually be 33%, and so
forth. Some of the percentages for women are also wrong; the number
identified as 4% in the 65535 row should be 2%, for example. I didn't
have time to go through and calculate every one, but those are some
representative inaccurate numbers.
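Here is a sketch of the denominator mix-up, in Python. The counts are
hypothetical; I picked them so that they reproduce the 66%, 39%, 4%
and 2% figures above, and they are not the actual counts in the 65535
row.

```python
# Hypothetical counts for one edit-count row, chosen only to reproduce the
# percentages discussed above; they are not the real figures from the table.
declared_male = 66
declared_female = 4
undeclared = 100

total = declared_male + declared_female + undeclared  # every editor in the row

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

# What the table appears to do: divide by the undeclared editors only.
wrong_male = pct(declared_male, undeclared)
wrong_female = pct(declared_female, undeclared)

# "Percentage of total" should divide by all editors in the row.
right_male = pct(declared_male, total)
right_female = pct(declared_female, total)

print(wrong_male, right_male)
print(wrong_female, right_female)
```

With these made-up counts the wrong denominator gives 66% and 4%,
while dividing by the full row gives 39% and 2%.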
What I see that's interesting in these numbers is something different
from what others are seeing: while very few editors self-identify as
female, the percentage of more active editors identifying as female
is actually twice the percentage of less active editors identifying as
female (1% up to 4,000 edits, 2-3% above that). But these 1-3% of
editors who identify as female aren't, or shouldn't be, the subjects
of interest to this discussion. The more useful question is, what
part of the great bulk of Wikipedians who don't self-identify by
gender is female? You don't know the answer to that question; you
can't estimate the answer to that question using these data that
answer a different question. You need more data about female
participation before you charge off generating strategies.
You need to know what the problem is before you can develop
strategies that have any meaningful chance of solving the problem.
Woonpton
On 2/11/11, Sage Ross <sross(a)wikimedia.org> wrote:
We're crossing streams a bit between this list and
wikitech-l.
On Fri, Feb 11, 2011 at 10:58 AM, Lars Aronsson <lars(a)aronsson.se>
wrote on wikitech-l:
One thing that could be interesting is to trace
the career of users: When they register, how
frequently they edit, if the frequency varies
over time, and if these patterns differ between
men and women and the gender-anonymous.
User:Dispenser is working on something similar, I think for the next
Signpost.
Take a look at this (a work in progress and not mine, so please don't
distribute):
http://toolserver.org/~dispenser/temp/gender/total_edit_zero_2011-02-10.png
The table at the left traces gender identification rates for editors
with at most the listed number of edits (but more than
the previous row). So the first row is editors with 0 edits, the
second is editors with 1 edit, the third is editors with 2-3 edits,
then 4-7 edits, etc. The last row is everyone with over ~65k edits
(and less than 5,000,000). It's based on essentially the 250,000 most
recent users who have edited or created an account.
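The binning described here can be sketched roughly as follows in
Python. This assumes the listed numbers are 0, 1, 3, 7, ... (one less
than a power of two), which matches the rows described above but is a
guess about how the table was actually built; `row_label` is a
hypothetical helper, not Dispenser's code.

```python
def row_label(edit_count: int) -> int:
    """Upper bound of the row an editor falls into (0, 1, 3, 7, 15, ...)."""
    if edit_count < 0:
        raise ValueError("edit count cannot be negative")
    # bit_length() is 0 for 0, 1 for 1, 2 for 2-3, 3 for 4-7, and so on.
    return (1 << edit_count.bit_length()) - 1

for n in (0, 1, 2, 3, 4, 7, 8, 1000):
    print(n, "->", row_label(n))
```

This sketch does not merge everything above ~65k edits into a single
final row, as the table does.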
So the takeaways are:
a) the more edits you make, the more likely you are to declare your gender.
b) the ratio of declared females to males falls from about 20% for
people who make just zero or one edit, to a stable 5-6% for people who
make 1000 or more edits.
Of course, as Woonpton notes, there could be factors that distort
that. Maybe women who become active editors are more likely than
other women to *not* declare gender. But at first glance, it would
seem that the gender gap is larger among very active editors.
-Sage
_______________________________________________
Gendergap mailing list
Gendergap(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/gendergap