pitching the Gender Edit Dashboard


Ryan Kaldari

25 Aug 2014

6:05 p.m.

I first pitched this idea to Aaron Halfaker in July, but nothing has happened so far, so I wanted to pitch it to the whole analytics team.

The Foundation has been discussing the gender gap and how to address it since I started 4 years ago. Often there is discussion of how particular features or projects might theoretically impact the gender gap: the Education Program, Visual Editor, WikiLove, editathons, etc. Unfortunately, we have absolutely no idea if any of these things have any impact. Nor do we have any idea if the gender gap is getting better or worse or staying the same. All we have is a handful of non-comparable data points based on surveys with different methodologies.

The main obstacle to generating useful gender gap data has always been that we don't have reliable absolute numbers because editors do not reliably indicate their gender in the preferences. There is nothing stopping us, however, from analysing *relative* trends using existing data. For example, we could generate graphs showing the relative difference per month in edits by men and women and this data would be unaffected by the unreliability of the absolute numbers (since we would only be looking at changes in the percentages). This is possible right now with existing data and shouldn't be very hard to generate (although the queries will be expensive).

To see a full explanation of the idea, please check out the Trello card and add comments there: https://trello.com/c/vLkEILa6/369-gender-edit-dashboard

Ryan Kaldari
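As a rough illustration of the kind of relative trend described in this message, the sketch below computes the month-over-month change in the share of edits from accounts whose gender preference is set to female. The monthly counts are invented for illustration and are not Wikimedia data, and reading "relative difference per month" as a change in percentage points is an assumption; the Trello card may define it differently.

```python
from typing import Dict, List, Tuple

# Hypothetical monthly edit counts by self-reported gender preference.
# These numbers are made up for illustration only.
MONTHLY_EDITS: Dict[str, Dict[str, int]] = {
    "2014-05": {"female": 1200, "male": 8800},
    "2014-06": {"female": 1150, "male": 8850},
    "2014-07": {"female": 1300, "male": 8700},
}


def female_share(counts: Dict[str, int]) -> float:
    """Share of edits from accounts set to female, among accounts
    that set the preference to either male or female."""
    total = counts["female"] + counts["male"]
    return counts["female"] / total


def monthly_change(data: Dict[str, Dict[str, int]]) -> List[Tuple[str, float]]:
    """Change in female share versus the previous month, in percentage points."""
    months = sorted(data)
    shares = [female_share(data[m]) for m in months]
    return [
        (months[i], round((shares[i] - shares[i - 1]) * 100, 2))
        for i in range(1, len(months))
    ]


print(monthly_change(MONTHLY_EDITS))
```

Only the differences between months are reported, not the shares themselves, which matches the intent of looking at changes rather than absolute numbers.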


Steven Walling

25 Aug

6:41 p.m.

On Mon, Aug 25, 2014 at 11:05 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

There is nothing stopping us, however, from analysing *relative* trends using existing data. For example, we could generate graphs showing the relative difference per month in edits by men and women and this data would be unaffected by the unreliability of the absolute numbers (since we would only be looking at changes in the percentages).

Using bad data here is worse than having no data. As Aaron and I recommended when we talked in person, we should not invest in using the gendered language preference data to track overall gender among editors. It's a case of garbage in, garbage out. Instead, we should be investing in more reliable ways to track gender among the editor population, if it's a metric that we care about. -- Steven Walling, Product Manager https://wikimediafoundation.org/

Ryan Kaldari

7:21 p.m.

On Mon, Aug 25, 2014 at 11:41 AM, Steven Walling <swalling(a)wikimedia.org> wrote:

...

On Mon, Aug 25, 2014 at 11:05 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

Steven Walling

11:54 p.m.

On Mon, Aug 25, 2014 at 12:21 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

You can get accurate information from bad or incomplete data.

The issue is not merely that data are incomplete like your tides example, it's that it's biased in many ways we can't quantify. If we care enough about this to monitor it on an ongoing basis, we should just do a better job of collecting accurate information. -- Steven Walling, Product Manager https://wikimediafoundation.org/

Dan Garry

26 Aug

12:30 a.m.

The sea isn't biased about how it reports its tide level, though. :-) The gender preference is intermediate data generated by applying a function (i.e. gender self-reporting) to the real data (i.e. a user's gender). In order to draw conclusions on the intermediate data and claim that they apply to the raw data, we would need to understand the function used to generate the intermediate data, and we don't. As such, any conclusions we draw would be totally invalid. Dan On 25 August 2014 12:21, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

On Mon, Aug 25, 2014 at 11:41 AM, Steven Walling <swalling(a)wikimedia.org> wrote:

On Mon, Aug 25, 2014 at 11:05 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

You can get accurate information from bad or incomplete data. For example, I can measure changes in tide levels without knowing the volume of the ocean. That's all I'm proposing doing here, measuring the change per month. Please take a look at the Trello card for a more complete description of the proposal. Ryan Kaldari _______________________________________________ Analytics mailing list Analytics(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

-- Dan Garry Associate Product Manager, Mobile Apps Wikimedia Foundation
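One way to picture the point about not knowing the self-reporting function: under a deliberately simple model in which each gender sets the preference at its own fixed rate, two quite different editor populations can yield the same observed share. The model, the function name, and all rates below are assumptions made for illustration; nothing here is measured data or the actual reporting behaviour.

```python
def observed_female_share(true_share: float, rate_f: float, rate_m: float) -> float:
    """Observed female share among users who set the preference.

    true_share: hypothetical true fraction of editors who are women
    rate_f, rate_m: hypothetical probabilities that a woman / a man
                    sets the gender preference at all
    """
    reported_f = true_share * rate_f
    reported_m = (1.0 - true_share) * rate_m
    return reported_f / (reported_f + reported_m)


# Scenario A: 10% women, both genders report at the same rate.
scenario_a = observed_female_share(true_share=0.10, rate_f=0.30, rate_m=0.30)

# Scenario B: 20% women, but women report less often than men.
scenario_b = observed_female_share(true_share=0.20, rate_f=0.20, rate_m=0.45)

print(round(scenario_a, 4), round(scenario_b, 4))
```

Both scenarios produce an observed share of about 10%, so the observed figure alone cannot distinguish them without knowing the reporting rates.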

Ryan Kaldari

1:10 a.m.

On Mon, Aug 25, 2014 at 4:54 PM, Steven Walling <swalling(a)wikimedia.org> wrote:

...

On Mon, Aug 25, 2014 at 12:21 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

You can get accurate information from bad or incomplete data.

The issue is not merely that data are incomplete like your tides example, it's that it's biased in many ways we can't quantify.

Yes, it's biased, but do we have any reason to think that this bias has changed significantly over time? If not, we can still derive some useful information from the dataset. Personally, I doubt that users change their gender setting very often, so even if the information is significantly incorrect, it's probably a relatively constant level of incorrectness. At the very least, it should give us an idea of which direction the gender gap is traveling in – is it increasing, decreasing, or staying relatively constant. I agree we could not draw any definite conclusions from such a graph, but it would at least give us some hints and maybe lead to some more interesting questions. We've had plenty of graphs and datasets in the past that we knew were biased, but we still wanted to look at anyway. If people don't think it's worth having in a dashboard, could we at least do a one time query and see if there's anything interesting in the data? Kaldari

Dan Garry

1:22 a.m.

Honestly, I disagree with pretty much everything you just said. Even if we assume the bias has remained the same, we still don't understand how it transforms the underlying data, and without that understanding any conclusions you draw will be totally invalid and tell you nothing about the gender gap. Dan On 25 August 2014 18:10, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

On Mon, Aug 25, 2014 at 4:54 PM, Steven Walling <swalling(a)wikimedia.org> wrote:

On Mon, Aug 25, 2014 at 12:21 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

You can get accurate information from bad or incomplete data.

The issue is not merely that data are incomplete like your tides example, it's that it's biased in many ways we can't quantify.

-- Dan Garry Associate Product Manager, Mobile Apps Wikimedia Foundation

Steven Walling

2:08 a.m.

On Mon, Aug 25, 2014 at 6:10 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

Yes, it's biased, but do we have any reason to think that this bias has changed significantly over time?

Yes. For instance, the language that describes the preference has been completely changed over time. It's really two very different data sets, before and after. -- Steven Walling, Product Manager https://wikimediafoundation.org/

Ryan Kaldari

3:52 a.m.

On Mon, Aug 25, 2014 at 7:08 PM, Steven Walling <swalling(a)wikimedia.org> wrote:

...

On Mon, Aug 25, 2014 at 6:10 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

Yes, it's biased, but do we have any reason to think that this bias has changed significantly over time?

Yes. For instance, the language that describes the preference has been completely changed over time. It's really two very different data sets, before and after.

Then we could just look at how it's changed since the last change to the preference. Kaldari

Ryan Kaldari

4:30 a.m.

On Mon, Aug 25, 2014 at 6:22 PM, Dan Garry <dgarry(a)wikimedia.org> wrote:

...

I agree that drawing any conclusions would be very premature. I just want to see what the data looks like and if it suggests any trends or sudden changes that may have happened over the past few years. I think we can all agree that the gender preference has *some* relation to actual gender. If the percentage of edits that came from people who identified as female suddenly increased or decreased at some point in time, wouldn't that be interesting regardless of our uncertainty that 100% of those people are actually female? If the percentage of edits that came from people who identified as female had been decreasing steadily over the past year, wouldn't that also be interesting regardless? I'm not suggesting we publish a paper about it. I'm just suggesting we look at the shape of the data and see if it suggests more specific questions. Right now, though, we have no information to go on, even to make educated guesses. Part of scientific investigation is forming a hypothesis, but that's difficult to do when you don't even have anecdotal evidence. There's nothing wrong with beginning an investigation with imperfect data. That's how most investigations begin. Kaldari

Leila Zia

4:53 p.m.

Thanks for initiating this thread, Kaldari. On Mon, Aug 25, 2014 at 9:30 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

Part of scientific investigation is forming a hypothesis, but that's difficult to do when you don't even have anecdotal evidence. There's nothing wrong with beginning an investigation with imperfect data. That's how most investigations begin.

I agree with this. There are two paths we can undertake here:

1. We look at the self-reported gender data and do some simple observations.
Pros:
+ we will have an updated view of the gender gap problem.
+ we may spread seeds for further internal and/or external research about it.
Cons:
- If simple observations are not communicated properly, they will result in misinformation, that can possibly do more harm than good.
- The results will be very limited given that we know the data is very limited and contains biases.

2. We do extensive gender gap analysis internally.
Proper gender gap analysis, in a way that can result in meaningful interventions (think products and features by us or the community) requires one person from R&D to work on it almost full time for a long period of time (at least six months, more probably a year). In this case, the question becomes: How should we prioritize this question? Just to give you some context: Which of the following areas should this one person from R&D work on?
* reducing gender gap
* increasing editor diversity in terms of nationality/language/...
* increasing the number of active editors independent of gender
* identifying areas Wikipedia is covered the least and finding editors who can contribute to those areas
* ...

I'd put this as a new request to Trello card and expand on it (what specific questions you're interested in? What do you think you or others want to do once you have the answer to those questions?). Then the team can prioritize given the other constraints we have.

Leila

...


Ryan Kaldari

28 Aug

2:50 a.m.

On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org> wrote:

...

1. We look at the self-reported gender data and do some simple observations. Pros: + we will have an updated view of the gender gap problem. + we may spread seeds for further internal and/or external research about it. Cons: - If simple observations are not communicated properly, they will result in misinformation, that can possibly do more harm than good. - The results will be very limited given that we know the data is very limited and contains biases.

I would definitely like to avoid spreading misinformation, which is why I proposed only looking at the percentage change per month rather than raw numbers or raw percentages. The raw numbers are almost certainly off-base and would be much more likely to be latched onto by the public and the media. Percentage change per month is a less 'sexy' statistic, but might give us better clues about what's actually going on with the gender gap over time. It would also, for the first time, give us some window into how new features or issues may be actively affecting the gender gap. But again, it would only be a canary in a coal mine, not a tool to draw reliable conclusions from. For that, we need more extensive tools and analysis.

2. We do extensive gender gap analysis internally.
Proper gender gap analysis, in a way that can result in meaningful interventions (think products and features by us or the community) requires one person from R&D to work on it almost full time for a long period of time (at least six months, more probably a year). In this case, the question becomes: How should we prioritize this question? Just to give you some context: Which of the following areas should this one person from R&D work on?
* reducing gender gap
* increasing editor diversity in terms of nationality/language/...
* increasing the number of active editors independent of gender
* identifying areas Wikipedia is covered the least and finding editors who can contribute to those areas
* ...

I think it's very difficult to judge how to set those priorities without having more data. We know that the active editors number is on a downward trajectory. Is the nationality/language diversity increasing or decreasing? Is the gender gap increasing or decreasing? In cases where things are actively getting worse, we should set our priorities to address them sooner, but without knowing those trajectories it's impossible to say. Kaldari

Aaron Halfaker

8:43 a.m.

I think the biggest problem is this: Let's say that we see the proportion of users who set their gender preference to female falling. Is that because women are becoming less likely to set their gender preference or because the ratio is actually becoming more extreme? Let's say that we see a trend in the messy data. What do we do about that? Do we assume that it is a change in the actual ratio? Do we assume that it is a change in the propensity of females to set their gender preference and there's nothing for us to do? Or do we then decide that it is important for us to gather good data so that we can actually know what's going on? -Aaron On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org> wrote:
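The propensity question raised in this message can be made concrete with a small simulation. It assumes a simple model in which each gender sets the preference at its own rate; the true share of women is held constant while the female reporting rate declines. All values are hypothetical, chosen only to show the effect, and do not describe real editors.

```python
from typing import List


def observed_share(true_share: float, rate_f: float, rate_m: float) -> float:
    """Female share among users who set the preference, under a simple
    model with one reporting rate per gender (an assumption)."""
    reported_f = true_share * rate_f
    reported_m = (1.0 - true_share) * rate_m
    return reported_f / (reported_f + reported_m)


TRUE_SHARE = 0.15   # hypothetical true share of women, constant over time
RATE_M = 0.30       # hypothetical male reporting rate, constant over time

# Hypothetical female reporting rate, declining month by month.
RATES_F: List[float] = [0.30, 0.28, 0.26, 0.24, 0.22, 0.20]

series = [round(observed_share(TRUE_SHARE, rf, RATE_M), 4) for rf in RATES_F]

for month, value in enumerate(series, start=1):
    print(f"month {month}: observed female share = {value}")
```

In this sketch the observed share falls every month even though the true ratio never changes, which is the ambiguity described above: a falling line on a dashboard would not by itself say which of the two explanations applies.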

Ryan Kaldari

4:50 p.m.

We could restrict the query to only look at editors who had explicitly set their gender preference. That would control for changes in the rate of setting the preference. The data would then only be biased by users who had explicitly set their gender to the incorrect gender, which I imagine would be a very small percentage. Also, I would like to point out that even our most fundamental metrics are affected by similar biases and inconsistencies. For example, the rate of new editors is polluted by long-time IP editors who suddenly decide to create an account. If there is an increase in IP editors converting to registered editors, it can mislead us into thinking that we are suddenly attracting a lot of new editors. This is just one of many examples I'm sure you're already familiar with. To answer your question though, I think if we notice something interesting in the data (especially a downward trend), we would start a discussion about it (as we would with any interesting data) and hopefully inspire someone to dig deeper. Right now though we are mostly in the dark. See, for example, Phoebe's most recent email to the gendergap list lamenting the lack of research and data. Kaldari On Thu, Aug 28, 2014 at 1:43 AM, Aaron Halfaker <ahalfaker(a)wikimedia.org> wrote:

...

On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org> wrote:


Ryan Kaldari

5:03 p.m.

...

On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org> wrote:

2. We do extensive gender gap analysis internally. Proper gender gap analysis, in a way that can result in meaningful interventions (think products and features by us or the community) requires one person from R&D to work on it almost full time for a long period of time (at least six months, more probably a year). In this case, the question becomes: How should we prioritize this question? Just to give you some context: Which of the following areas should this one person from R&D work on? * reducing gender gap * increasing editor diversity in terms of nationality/language/... * increasing the number of active editors independent of gender * identifying areas Wikipedia is covered the least and finding editors who can contribute to those areas * ...


Jonathan Morgan

7:55 p.m.

Stepping back... We all seem to agree that user-set gender preference is a problematic measure. We don't trust it. We can come up with plausible hypotheses for why someone would mis-report their gender. And we can be almost certain it's not a representative sample. Do we have any ideas for what a *better* measure would be? Seems to me that we're dealing with self-report data no matter what. But perhaps a more explicit elicitation would be better? Folks have suggested a one-question gender microsurvey before. Of course that will come with its own sources of bias, and I don't quite see how we can control for them. Given that it would be useful to have some data on gendered editing patterns (whether we share it publicly or not), what are our options? - Jonathan On Thu, Aug 28, 2014 at 10:03 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

And because I know someone is going to point this out... Actually, restricting the data to only editors who have explicitly set their gender would not completely control for changes in the rate of setting the preference since that rate could change differently for men and women. It would at least help to control for overall changes in the rate, for example, due to the change in the interface that Steven mentioned. Kaldari

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)> jmorgan(a)wikimedia.org

Andrew Gray

8:25 p.m.

I believe we did a one-question gender microsurvey before (linked to one of the new-user features?). I don't know whether the data was useful or not, but I do remember the act of asking the question itself got some pushback as being invasive/unwelcoming/weirdly communicated/etc. (and I can certainly sympathise with this). So as well as the value of the data, we should consider whether the act/method of asking is going to have knock-on effects on what we're trying to measure. Andrew. On 28 August 2014 20:55, Jonathan Morgan <jmorgan(a)wikimedia.org> wrote:

...


-- - Andrew Gray andrew.gray(a)dunelm.org.uk

Steven Walling

8:31 p.m.

On Thu, Aug 28, 2014 at 1:25 PM, Andrew Gray <andrew.gray(a)dunelm.org.uk> wrote:

...

Yes, we've tried this once. It's documented at https://meta.wikimedia.org/wiki/Research:Gender_micro-survey Frankly this was pretty hackish and limited. The data is not garbage but it's not the most elegant approach to a one question survey. -- Steven Walling, Product Manager https://wikimediafoundation.org/

Ryan Kaldari

8:32 p.m.

The results of the microsurvey are at: https://meta.wikimedia.org/wiki/Research:Gender_micro-survey This was a survey of new account holders (not necessarily editors). The results were 67% male, 22% female, 11% prefer not to say. I think the survey was useful in that it let us know that the gender gap exists as early as the account sign-up funnel. Kaldari On Thu, Aug 28, 2014 at 1:25 PM, Andrew Gray <andrew.gray(a)dunelm.org.uk> wrote:

...

Stepping back... We all seem to agree that user-set gender preference is a problematic measure. We don't trust it. We can come up with plausible hypotheses for

why

someone would mis-report their gender. And we can be almost certain it's

not

a representative sample. Do we have any ideas for what a better measure would be? Seems to me that we're dealing with self-report data no matter what. But perhaps a more explicit elicitation would be better? Folks have suggested a

one-question

gender microsurvey before. Of course that will come with its own sources

bias, and I don't quite see how we can control for them. Given that it would be useful to have some data on gendered editing

patterns

(whether we share it publicly or not), what are our options? - Jonathan On Thu, Aug 28, 2014 at 10:03 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote: > > And because I know someone is going to point this out... Actually, > restricting the data to only editors who have explicitly set their

gender

> would not completely control for changes in the rate of setting the > preference since that rate could change differently for men and women.

> would at least help to control for overall changes in the rate, for

example,

> due to the change in the interface that Steven mentioned. > > Kaldari > > On Aug 28, 2014, at 9:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org>

wrote:

> > We could restrict the query to only look at editors who had explicitly

set

> their gender preference. That would control for changes in the rate of > setting the preference. The data would then only be biased by users who

had

> explicitly set their gender to the incorrect gender, which I imagine

would

> be a very small percentage. > > Also, I would like to point out that even our most fundamental metrics

are

> affected by similar biases and inconsistencies. For example, the rate

of new

> editors is polluted by long-time IP editors who suddenly decide to

create an

> account. If there is an increase in IP editors converting to registered > editors, it can mislead us into thinking that we are suddenly

attracting a

> lot of new editors. This is just one of many examples I'm sure you're > already familiar with. > > To answer your question though, I think if we notice something

interesting

> in the data (especially a downward trend), we would start a discussion

about

> it (as we would with any interesting data) and hopefully inspire

someone to

> dig deeper. Right now though we are mostly in the dark. See, for

example,

> Phoebe's most recent email to the gendergap list lamenting the lack of > research and data. > > Kaldari > > > On Thu, Aug 28, 2014 at 1:43 AM, Aaron Halfaker <

ahalfaker(a)wikimedia.org>

> wrote: >> >> I think the biggest problem is this: >> >> Let's say that we see the proportion of users who set their gender >> preference to female falling. Is that because women are becoming less >> likely to set their gender preference or because the ratio is actually >> becoming more extreme? >> >> Let's say that we see a trend in the messy data. What do we do about >> that? Do we assume that it is a change in the actual ratio? Do we

assume

>> that it is a change in the propensity of females to set their gender >> preference and there's nothing for us to do? Or do we then decide

that it

>> is important for us to gather good data so that we can actually know

what's

>> going on? >> >> -Aaron >> >> >> On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> >> wrote: >>> >>> On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org>

wrote:

>>>> >>>> 1. We look at the self-reported gender data and do some simple >>>> observations. >>>> Pros: >>>> + we will have an updated view of the gender gap problem. >>>> + we may spread seeds for further internal and/or external

research

>>>> about it. >>>> Cons: >>>> - If simple observations are not communicated properly, they will >>>> result in misinformation, that can possibly do more harm than good. >>>> - The results will be very limited given that we know the data is >>>> very limited and contains biases. >>> >>> >>> I would definitely like to avoid spreading misinformation, which is

why

>>> I proposed only looking at the percentage change per month rather

than raw

>>> numbers or raw percentages. The raw numbers are almost certainly

off-base

>>> and would be much more likely to be latched onto by the public and the >>> media. Percentage change per month is a less 'sexy' statistic, but

might

>>> give us better clues about what's actually going on with the gender

gap over

>>> time. It would also, for the first time, give us some window into how

new

>>> features or issues may be actively affecting the gender gap. But

again, it

>>> would only be a canary in a coal mine, not a tool to draw reliable >>> conclusions from. For that, we need more extensive tools and analysis. >>> >>>> 2. We do extensive gender gap analysis internally. >>>> Proper gender gap analysis, in a way that can result in meaningful >>>> interventions (think products and features by us or the community)

requires

>>>> one person from R&D to work on it almost full time for a long period

of time

>>>> (at least six months, more probably a year). In this case, the

question

>>>> becomes: How should we prioritize this question? Just to give you

some

>>>> context: Which of the following areas should this one person from

R&D work

>>>> on? >>>> * reducing gender gap >>>> * increasing editor diversity in terms of nationality/language/... >>>> * increasing the number of active editors independent of gender >>>> * identifying areas Wikipedia is covered the least and finding >>>> editors who can contribute to those areas >>>> * ... >>> >>> >>> I think it's very difficult to judge how to set those priorities

without

>>> having more data. We know that the active editors number is on a

downward

>>> trajectory. Is the nationality/language diversity increasing or

decreasing?

>>> Is the gender gap increasing or decreasing? In cases where things are >>> actively getting worse, we should set our priorities to address them

sooner,

> but without knowing those trajectories it's impossible to say. > > Kaldari > > _______________________________________________ > Analytics mailing list > Analytics(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/analytics > _______________________________________________ Analytics mailing list Analytics(a)lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

--
- Andrew Gray
andrew.gray(a)dunelm.org.uk

Aaron Halfaker

29 Aug

8:03 a.m.

Two things:

- Kaldari, I think I'm failing to communicate the propensity issue I was referring to earlier. Suffice it to say that we still have a propensity problem when we only look at those users who choose to set their gender preference. In fact, this was the scenario I had in mind when I brought up the problem.

- J-Mo, I like the suggestion of micro-surveys. I made it to Kaldari earlier on the Trello card. I agree that this might be confusing/concerning to our users. I wonder if we might explore ways to improve such a survey. For example, we might include the gender question in the signup form for a small percentage of newly registered users. I'm used to (optionally) setting my gender at signup so that the UI will use the right pronouns.

-Aaron

On Thu, Aug 28, 2014 at 10:32 PM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:

...

Stepping back... We all seem to agree that user-set gender preference is a problematic measure. We don't trust it. We can come up with plausible hypotheses for why someone would mis-report their gender. And we can be almost certain it's not a representative sample.

Do we have any ideas for what a better measure would be? Seems to me that we're dealing with self-report data no matter what. But perhaps a more explicit elicitation would be better? Folks have suggested a one-question gender microsurvey before. Of course that will come with its own sources of bias, and I don't quite see how we can control for them.

Given that it would be useful to have some data on gendered editing patterns

gender

> would not completely control for changes in the rate of setting the preference since that rate could change differently for men and women.

> would at least help to control for overall changes in the rate, for example, due to the change in the interface that Steven mentioned.
>
> Kaldari
>
> On Aug 28, 2014, at 9:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:
>
> We could restrict the query to only look at editors who had explicitly set their gender preference. That would control for changes in the rate of setting the preference. The data would then only be biased by users who had explicitly set their gender to the incorrect gender, which I imagine would be a very small percentage.
>
> Also, I would like to point out that even our most fundamental metrics are affected by similar biases and inconsistencies. For example, the rate of new editors is polluted by long-time IP editors who suddenly decide to create an account. If there is an increase in IP editors converting to registered editors, it can mislead us into thinking that we are suddenly attracting a lot of new editors. This is just one of many examples I'm sure you're already familiar with.
>
> To answer your question though, I think if we notice something interesting in the data (especially a downward trend), we would start a discussion about it (as we would with any interesting data) and hopefully inspire someone to dig deeper. Right now though we are mostly in the dark. See, for example, Phoebe's most recent email to the gendergap list lamenting the lack of research and data.
>
> Kaldari
>
> On Thu, Aug 28, 2014 at 1:43 AM, Aaron Halfaker <ahalfaker(a)wikimedia.org>

assume

>> that it is a change in the propensity of females to set their gender preference and there's nothing for us to do? Or do we then decide that it is important for us to gather good data so that we can actually know what's going on?
>>
>> -Aaron
>>
>> On Thu, Aug 28, 2014 at 4:50 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:
>>>
>>> On Tue, Aug 26, 2014 at 9:53 AM, Leila Zia <leila(a)wikimedia.org> wrote:

research

why

>>> I proposed only looking at the percentage change per month rather than raw numbers or raw percentages. The raw numbers are almost certainly off-base and would be much more likely to be latched onto by the public and the media. Percentage change per month is a less 'sexy' statistic, but might give us better clues about what's actually going on with the gender gap over time. It would also, for the first time, give us some window into how new features or issues may be actively affecting the gender gap. But again, it would only be a canary in a coal mine, not a tool to draw reliable conclusions from. For that, we need more extensive tools and analysis.
>>>
>>>> 2. We do extensive gender gap analysis internally.
>>>> Proper gender gap analysis, in a way that can result in meaningful interventions (think products and features by us or the community) requires one person from R&D to work on it almost full time for a long period of time (at least six months, more probably a year). In this case, the question becomes: How should we prioritize this question? Just to give you some context: Which of the following areas should this one person from R&D work on?
>>>> * reducing gender gap
>>>> * increasing editor diversity in terms of nationality/language/...
>>>> * increasing the number of active editors independent of gender
>>>> * identifying areas Wikipedia is covered the least and finding editors who can contribute to those areas
>>>> * ...
>>>
>>> I think it's very difficult to judge how to set those priorities without having more data. We know that the active editors number is on a downward trajectory. Is the nationality/language diversity increasing or decreasing? Is the gender gap increasing or decreasing? In cases where things are actively getting worse, we should set our priorities to address them sooner, but without knowing those trajectories it's impossible to say.
>>>
>>> Kaldari

Dan Andreescu

11:58 a.m.

...

> - I wonder if we might explore ways to improve such a survey. For example, we might include the gender question in the signup form for a small percentage of newly registered users.

This experiment sounds more useful than the current gender data. Over time, it would also allow us to track retention rate by gender for those who answer the question.

Leila Zia

2:01 p.m.

On Fri, Aug 29, 2014 at 4:58 AM, Dan Andreescu <dandreescu(a)wikimedia.org> wrote:

...

time, it would also allow us to track retention rate by gender for those who answer the question.

...


Kevin Leduc

6:48 p.m.

Does Comscore have any gender data from their panels?

I think finding out more about the gender gap in editors is a scientific research project. It's not an easy problem to formulate and some thorough research and experimentation is needed. I'm not sure if pulling together some reports from data we have would be beneficial or actionable.

If this pitch is meant for us to prioritize gender research and closing the gap, then let's have a discussion about that. How important is gender research relative to everything else we are doing at WMF? Is this something someone at a university would be willing to study?

On Fri, Aug 29, 2014 at 7:01 AM, Leila Zia <leila(a)wikimedia.org> wrote:

...

On Fri, Aug 29, 2014 at 4:58 AM, Dan Andreescu <dandreescu(a)wikimedia.org> wrote:

time, it would also allow us to track retention rate by gender for those who answer the question.


Dario Taraborelli

6:59 p.m.

I too recommend the use of micro-surveys. The full rationale is here [1] but one of the immediate benefits I see is the ability to randomly sample from the population of newly registered users. It shouldn’t be particularly hard to set up an ongoing gender micro-survey to collect this data over time (it’s more a question for UX/Product: would this interfere with the existing acquisition workflow?). We can also trigger a micro-survey at the end of the edit funnel and measure user drop-off rate by (self-reported) gender.

Product has concerns about adding extra fields to the signup screen: they may not be optimal from a UX perspective, but micro-surveys are the most flexible way of collecting this kind of demographic data without heavy MediaWiki engineering effort.

Dario

[1] http://www.mediawiki.org/wiki/Extension:GuidedTour/Microsurveys

On Aug 29, 2014, at 7:01 AM, Leila Zia <leila(a)wikimedia.org> wrote:

...

On Fri, Aug 29, 2014 at 4:58 AM, Dan Andreescu <dandreescu(a)wikimedia.org> wrote:

I wonder if we might explore ways to improve such a survey. For example, we might include the gender question in the signup form for a small percentage of newly registered users.

This experiment sounds more useful than the current gender data. Over time, it would also allow us to track retention rate by gender for those who answer the question.

+1

Dario Taraborelli

31 Aug

4:45 p.m.

…meanwhile: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0104880

(I reached out to Chato, Mayo and David to ask if they would like to present this work at the research showcase)

Emotions under Discussion: Gender, Status and Communication in Online Collaboration
Daniela Iosub, David Laniado, Carlos Castillo, Mayo Fuster Morell, Andreas Kaltenbrunner
Published: August 20, 2014
DOI: 10.1371/journal.pone.0104880

Background
Despite the undisputed role of emotions in teamwork, not much is known about the make-up of emotions in online collaboration. Publicly available repositories of collaboration data, such as Wikipedia editor discussions, now enable the large-scale study of affect and dialogue in peer production.

Methods
We investigate the established Wikipedia community and focus on how emotion and dialogue differ depending on the status, gender, and the communication network of the editors who have written at least 100 comments on the English Wikipedia's article talk pages. Emotions are quantified using a word-based approach comparing the results of two predefined lexicon-based methods: LIWC and SentiStrength.

Principal Findings
We find that administrators maintain a rather neutral, impersonal tone, while regular editors are more emotional and relationship-oriented, that is, they use language to form and maintain connections to other editors. A persistent gender difference is that female contributors communicate in a manner that promotes social affiliation and emotional connection more than male editors, irrespective of their status in the community. Female regular editors are the most relationship-oriented, whereas male administrators are the least relationship-focused. Finally, emotional and linguistic homophily is prevalent: editors tend to interact with other editors having similar emotional styles (e.g., editors expressing more anger connect more with one another).

On Aug 29, 2014, at 11:59 AM, Dario Taraborelli <dtaraborelli(a)wikimedia.org> wrote:

...


24 comments


participants (10)

Aaron Halfaker
Andrew Gray
Dan Andreescu
Dan Garry
Dario Taraborelli
Jonathan Morgan
Kevin Leduc
Leila Zia
Ryan Kaldari
Steven Walling