Wow, Kerry! Thank you for taking the time to write all these thoughts out.
I'm asking the question because I'm concerned that the gender balance of the authors cited on Wikipedia differs from the already quite bad patterns in academia. My fear is that the citation gender imbalance on Wikipedia is more pronounced. If so, Wikipedia is not just perpetuating the problem but making it worse, by surfacing some authors and ideas even more frequently and others hardly at all. I would like to know whether this is the case and, if so, how big the effect is.
In my last message, I mentioned a study of a set of award-winning political science books (the researchers studied the citation gender imbalance for that set). I only saw the study today, but I began to think that this set of works, or some similar set of titles, could be a good place to begin, especially if the original researchers were willing to share the list of titles, authors, genders, etc. that they compiled. Then it would mostly be a matter of working out how those titles are cited on Wikipedia, through either the citation dataset or wikicite, to see whether and how the citation patterns differ (i.e., whether the works by women and men are cited more frequently, at the same rate, or less frequently on Wikipedia than the researchers found in the original study).
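To make the comparison concrete, here is a rough sketch of how it might be set up, assuming the researchers' list could be turned into rows of title, author gender, Google Scholar citation count, and Wikipedia citation count. The row layout, the function names, and every number below are invented for illustration; none of them come from the study.

```python
from collections import defaultdict

# Hypothetical rows: (title, author_gender, scholar_citations, wikipedia_citations).
# All values are made up for illustration; they are not from the APSA study.
BOOKS = [
    ("Book A", "woman", 120, 3),
    ("Book B", "man", 200, 8),
    ("Book C", "woman", 80, 1),
    ("Book D", "man", 150, 6),
]

def mean_citations(rows):
    """Return {gender: (mean scholar citations, mean Wikipedia citations)}."""
    totals = defaultdict(lambda: [0, 0, 0])  # scholar sum, Wikipedia sum, count
    for _title, gender, scholar, wiki in rows:
        entry = totals[gender]
        entry[0] += scholar
        entry[1] += wiki
        entry[2] += 1
    return {g: (s / n, w / n) for g, (s, w, n) in totals.items()}

def women_to_men_ratio(rows):
    """Ratio of women's to men's mean citations: (scholar ratio, Wikipedia ratio)."""
    means = mean_citations(rows)
    scholar = means["woman"][0] / means["man"][0]
    wiki = means["woman"][1] / means["man"][1]
    return scholar, wiki

scholar_ratio, wiki_ratio = women_to_men_ratio(BOOKS)
print(f"scholar ratio={scholar_ratio:.2f} wikipedia ratio={wiki_ratio:.2f}")
```

If the Wikipedia ratio came out lower than the scholarly one on the real list, that would point toward the imbalance being more pronounced on Wikipedia, though a set this small could only ever be suggestive.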
This seems like it would be easier to do than what you propose, but perhaps the idea is not sound. Until very recently, I thought I could find the answer in an existing paper! I honestly don't know the best way to get the answer, but I would like to know the answer and think it's important to look at.
All of the things you bring up--from the gender of the editor, to the type of editing being done, to the issues around multiple authors/paywalls/year of publication/field--complicate the inquiry, particularly a larger one. I agree with what you say about doing something small first to see what's there.
Thanks again for all your thoughts. Greg
On Thu, Aug 22, 2019 at 9:41 PM wiki-research-l-request@lists.wikimedia.org wrote:
Send Wiki-research-l mailing list submissions to wiki-research-l@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/wiki-research-l or, via email, send a message with subject or body 'help' to wiki-research-l-request@lists.wikimedia.org
You can reach the person managing the list at wiki-research-l-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wiki-research-l digest..."
Today's Topics:
- Re: gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Kerry Raymond)
Message: 1 Date: Thu, 22 Aug 2019 18:47:48 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: < CAOO9DNvBrw_aLkRUp5kYFLdaLJUEK+ddiz-A09MZwiotAdAmUw@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi Leila,
Thanks for your thoughts.
Having just read Troy Vettese's very powerful essay, Sexism in the Academy ( https://nplusonemag.com/issue-34/essays/sexism-in-the-academy/), I wish this were a top priority.
I stumbled upon a study today--it came up in the Washington Post's excellent series on gender bias in political science. The authors look at a set of award-winning political science books and the gender imbalance in their citations, drawn from Google Scholar. I'm linking the piece here in case anyone on this list is interested, now or in the future, in how the patterns on Wikipedia compare.
Washington Post piece: "There’s a gender gap in who wins political science book awards – and in how widely they’re cited"
https://www.washingtonpost.com/politics/2019/08/22/theres-gender-gap-who-win... "Just as significantly, women’s award-winning books receive fewer scholarly citations than men’s award-winning volumes — and this disparity has grown, rather than shrunk, in recent years. Over the entire period, APSA award-winning volumes by women averaged 43 percent fewer citations per year than those by male authors."
Paper: "Winning awards and gaining recognition: An impact analysis of APSA section book prizes" https://www.sciencedirect.com/science/article/abs/pii/S0362331918300867
Best, Greg
On Thu, Aug 22, 2019 at 3:44 PM < wiki-research-l-request@lists.wikimedia.org> wrote:
Today's Topics:
- Re: gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Leila Zia)
- Wikimania 2019 disinformation meetup follow-up (Leila Zia)
- Upcoming Research Newsletter (special issue on gender gap research): New papers open for review (Mohammed Sadat Abdulai)
Message: 1 Date: Thu, 22 Aug 2019 09:57:15 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: <CAOO9DNuSYzzaVwcdqiWA7pj671z3N43XOSwv6DtW0SxWg= L8GQ@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi Kerry,
Those are all very interesting ways to look at this. I was thinking mostly along the lines of your first bullet point, but I'd be interested in research in any of those areas.
Thanks, Greg
On Thu, Aug 22, 2019 at 5:00 AM < wiki-research-l-request@lists.wikimedia.org> wrote:
Today's Topics:
- gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Kerry Raymond)
Message: 1 Date: Wed, 21 Aug 2019 20:19:18 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] gender balance of wikipedia citations Message-ID: < CAOO9DNtY+oDO5oQrMZeG1NZE-kYNYLWnTD6acHeYTbYeGk8k2Q@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Greetings!
I was looking for information about the gender balance of Wikipedia citations and no one I've asked knows of any work on this topic. Do you?
I think this is an important question.
Here's what I've learned so far:
Wikipedia citations are currently in the form of text strings. There is also an initiative to place citations in an annotated structured repository (wikicite). I do not know the current status of wikicite or if/when this could be used for this inquiry--either to examine all, or a sensible subset of the citations.
My perspective is that understanding the gender balance is necessary and urgent. The balance could be better, the same, or worse than the citation balances we already know, and the scale of the effect is quite large.
Is this a line of inquiry that the wikimedia/wikicite community is interested in pursuing? If so, what is the best way to get started? Does the WMF have the resources and interest to look into this matter in-house?
Thanks for your thoughts.
Greg
Message: 2 Date: Thu, 22 Aug 2019 13:53:45 +1000 From: "Kerry Raymond" kerry.raymond@gmail.com To: "'Research into Wikimedia content and communities'" wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: 00ed01d5589d$33e31ed0$9ba95c70$@gmail.com Content-Type: text/plain; charset="UTF-8"
Could you elaborate a bit more on what you mean by the gender balance of citations?
Are you talking about:
- proportion of male vs female authors of the source material used as citations in arbitrary articles?
- the quality/quantity of citations in biography articles of men vs women?
- the quality/quantity of citations in articles that are gendered by some other criteria (e.g. reader interest, romantic comedy vs action film)?
Kerry
Subject: Digest Footer
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
End of Wiki-research-l Digest, Vol 168, Issue 11
Message: 2 Date: Thu, 22 Aug 2019 10:43:51 -0700 From: Leila Zia leila@wikimedia.org To: Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: <CAK0Oe2uCo70_=ma2b=2d+fvr4GseEVxOP0sh= ELNOpKdCuUfqA@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi Greg,
A few comments if you're going to go with "proportion of male vs female authors of the source material used as citations in arbitrary articles":
- Please differentiate between sex (female, male, ...) and gender (woman, man, ...). My understanding from your initial email is that you want to stay focused on gender, not sex.
- Unless you have reliable sources about the gender of an author, I would not recommend trying to predict what the gender is. (As you may know, this is not uncommon in social media studies, for example, to predict the gender of the author based on their image or their name. These approaches introduce biases and social challenges.)
- Re your question about whether WMF has resources to look into this question in-house: I can't speak for the whole of WMF, however, I can share more about the Research team's direction. As part of our future work, we would like to "help contributors monitor violations of core content policies and assess information reliability and bias both granularly and at scale". [1] The question you proposed can fall under assessing bias in content (considering citations as part of the content). I expect us to focus first on the piece about violations of core content policies and information reliability and come back to the bias question later. As a result, we won't have bandwidth to do your proposal in-house at the moment. Sorry about that.
I hope this helps.
Best, Leila
[1] Section 2 of our Knowledge Integrity whitepaper:
https://upload.wikimedia.org/wikipedia/commons/9/9a/Knowledge_Integrity_-_Wi...
Message: 3 Date: Thu, 22 Aug 2019 13:36:17 -0700 From: Leila Zia leila@wikimedia.org To: Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] Wikimania 2019 disinformation meetup follow-up Message-ID: <CAK0Oe2sodYJpkuhSqgo3dtfDr= NQ5EK1TdH16F6BOkTyFho9Rg@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi,
This message is for those of you who attended the disinformation meet-up [0] in Wikimania 2019 [1] or others who may be interested.
- The notes from our meet-up are now posted at the bottom of the page [0].
- I was tasked to see if space.wmflabs.org is the place for us to continue conversations about this topic. The answer is yes. Thanks to the help of Elena Lappen, we now have a dedicated subcategory for disinformation: https://discuss-space.wmflabs.org/c/research/disinformation . Feel free to subscribe, watch, and/or post new topics if you're involved in this space.
- If you are new to this conversation, please read the purpose of the subcategory at https://discuss-space.wmflabs.org/t/about-the-disinformation-category/949 and welcome! :)
Best, Leila
[0] https://wikimania.wikimedia.org/wiki/2019:Meetups/Disinformation [1] https://wikimania.wikimedia.org/wiki/2019:Program
Message: 4 Date: Thu, 22 Aug 2019 22:43:53 +0000 (UTC) From: Mohammed Sadat Abdulai masssly@ymail.com To: Research Into Wikimedia Content and Communities wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] Upcoming Research Newsletter (special issue on gender gap research): New papers open for review Message-ID: 1625269943.668598.1566513833343@mail.yahoo.com Content-Type: text/plain; charset=UTF-8
Hi everyone,
We’re preparing for the August 2019 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201908 and add your name next to any paper you are interested in covering. Our target publication date is 31 August, 11:59 UTC. As usual, short notes and one-paragraph reviews are most welcome.
For the August edition, we are planning a special issue focusing mainly on recent gender gap/gender bias research. (Upcoming special issue topics may include health and education.) There are about 20 papers from this area on our todo list, which will all be covered in the August issue, either as a mere list item or - with your help - in the form of a more informative writeup or review. They include:
- Analyzing Gender Stereotyping in Bollywood Movies
- Breaking the glass ceiling on Wikipedia
- Breastfeeding, Authority, and Genre: Women's Ethos in Wikipedia and Blogs
- Cyberfeminism on Wikipedia: Visibility and deliberation in feminist Wikiprojects
- Gender and deletion on Wikipedia
- Gender imbalance and Wikipedia
- Gender Markers in Wikipedia Usernames
- How do students trust Wikipedia? An examination across genders
- Investigating the Gender Pronoun Gap in Wikipedia
- It’s Not What You Think: Gender Bias in Information about Fortune 1000 CEOs on Wikipedia
- Mapping and Bridging the Gender Gap: An Ethnographic Study of Indian Wikipedians and Their Motivations to Contribute
- People Who Can Take It: How Women Wikipedians Negotiate and Navigate Safety
- Redressing Gender Inequities on Wikipedia Through an Editathon
- Similar Gaps, Different Origins? Women Readers and Editors at Greek Wikipedia
- Simulation Experiments on (the Absence of) Ratings Bias in Reputation Systems
- The Gendered Presentation of Professions on Wikipedia
- Who Counts as a Notable Sociologist on Wikipedia? Gender, Race, and the “Professor Test”
- Who Wants to Read This?: A Method for Measuring Topical Representativeness in User Generated Content Systems
- Women and Wikipedia. Diversifying Editors and Enhancing Content through Library Edit-a-Thons
Masssly and Tilman Bayer
[1] Research:Newsletter - Meta
[2] WikiResearch (@WikiResearch) on Twitter
End of Wiki-research-l Digest, Vol 168, Issue 12
Message: 2 Date: Fri, 23 Aug 2019 14:41:09 +1000 From: "Kerry Raymond" kerry.raymond@gmail.com To: "'Research into Wikimedia content and communities'" wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: 001001d5596c$fe22a100$fa67e300$@gmail.com Content-Type: text/plain; charset="utf-8"
Yes, that was my thought. It would be difficult to know the sex (or the gender) of an author from the name on a paper. There would inevitably be a lot that you could not determine. And certainly in the sciences multi-author papers are the norm, and even where you did know the sex/gender of all the authors, do you assign some part-score? E.g. 0 for all male, 1 for all female, 0.6 for 3 women and 2 men.
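The part-score idea can be sketched in a few lines. This is only an illustration of the scoring rule described above: the function name and the "woman"/"man" labels are my own assumptions, the labels are taken to come from reliable sources rather than guesses from names, and authors whose gender is not recorded are simply left out of the denominator.

```python
def women_share(author_genders):
    """Fraction of a source's authors recorded as women.

    Authors with any other or unknown label are ignored. Returns None when
    no author has a recorded gender, so the source can be set aside.
    """
    known = [g for g in author_genders if g in ("woman", "man")]
    if not known:
        return None
    return sum(g == "woman" for g in known) / len(known)

# Three women and two men, as in the example above.
print(women_share(["woman", "woman", "woman", "man", "man"]))
# A source with no recorded author genders is left unscored.
print(women_share(["unknown", "unknown"]))
```

Dropping unknowns is one choice among several; a study would need to report how many sources ended up unscored, since that share could be large.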
But I am curious why you are asking the question. Is it that the writing/research of women is under-represented in Wikipedia citations? If so, without conducting any research, I'd say "yes it is under-represented". But my reason would be that women are under-represented as writers/researchers in the first place. And certainly the older the source, the more likely it is to be written by a man. So to investigate gender bias in citations in Wikipedia, you would have to estimate the proportion of men/women (or at least their outputs) over time in a given discipline and then ask the question, "taking into account the time of publication of a citation and the proportion of men/women active in this discipline at that time, do Wikipedia citations show a sex/gender bias?". Hmm ... very tricky.
I'd be inclined to suggest starting with a much simpler task. Pick a discipline (preferably one with a professional society that can tell you its estimate of the male/female ratio over, say, the past 5 years), limit the Wikipedia articles to topics in that discipline, and limit the citations to those published within the last 5 years. Indeed, perhaps limit it to publications that are principally from the same country or countries as the professional society from which you get the data (as clearly men's/women's participation in any discipline can vary between countries for cultural reasons). Then you have some way to gauge whether Wikipedia is showing more or less gender bias in its citations than the discipline itself exhibits through publication. Quite a challenge!
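As a sketch of how that gauge might work once the filtered citations are counted: compare the share of citations with a woman author against the society's estimate, and ask how surprising the difference is. The function names and the numbers are invented, and the exact binomial test is my own choice of method; it treats each citation as an independent draw, which real citations are not, so the p-value is at best a rough guide.

```python
from math import comb

def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed count k, given n trials with success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))

def compare_to_baseline(women_cited, total_cited, baseline_share):
    """Return (observed share, observed minus baseline, p-value)."""
    observed = women_cited / total_cited
    p_value = binomial_two_sided_p(women_cited, total_cited, baseline_share)
    return observed, observed - baseline_share, p_value

# Invented numbers: 12 of 60 recent citations have a woman as author, and the
# professional society estimates that 35% of the discipline are women.
observed, gap, p_value = compare_to_baseline(12, 60, 0.35)
print(f"observed={observed:.2f} gap={gap:+.2f} p={p_value:.3f}")
```

How a multi-author source counts toward `women_cited` is a separate decision that this sketch leaves open.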
And of course, it is not Wikipedia that adds citations. It is individual contributors who add citations. Does the sex/gender of the contributor have any correlation to any observed bias? Again, the task is made more difficult because a lot of Wikipedians don't identify their sex/gender.
The other thing to be alert to is the difference in how (I believe) Wikipedians cite compared to researchers. As a researcher, I will of course be reading papers in my field all the time and what I read will influence my subsequent work. Therefore when I write about my research, my citations refer to papers that I have already read and whose authors may be familiar to me from their other work, from having met them at conferences, from private correspondence, etc.
However as a Wikipedian, I am only partially operating that way (mostly when I write new articles or significantly expand them, that is, when I am doing the research). A lot of the time I am adding citations relating to content other people (often new users) have added/changed without citation. These come up on my watchlist all the time. What do I do? Of course I could revert saying "no citation provided", but that's not the way to encourage new contributors nor to grow the encyclopedia, so if the information seems plausible (not obviously vandalism), I will attempt to find a citation for it (using tools like Google and other topic-specialised search tools). This is what I call the "lucky dip" mode of citing, as obviously I have no idea what the source was for the original contributor. The sources I find from my search may not already be known to me (frequently they are not).
Or to summarise, IMHO, researchers (or Wikipedians in "new content mode") cite a source already known to them and whose authors may be known to them, and could consciously or unconsciously engage in some discrimination in citation based on sex/gender or other criteria, whereas Wikipedians in "updating mode" are likely to be citing a source not previously known to them, may be happy just to have found a source, and are unlikely to spend a lot of their time researching the authors of that source to the extent that they could then consciously or unconsciously exercise discrimination on sex/gender.
If I invest any extra effort in such situations, it's probably because the wording of the source is a close match to the Wikipedia article, which raises the question of copyright violation (which needs to be dealt with by deletion or rewriting) or of the source being a Wikipedia mirror (which is obviously not an acceptable citation).
So I suspect whether a citation was added by the same contributor as the content it supports or a subsequent contributor probably makes a difference to the likelihood of conscious/unconscious discrimination.
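If the revision history can tell you who added a piece of content and who added its citation, the two modes could be compared along these lines. This is only a sketch: the record layout is hypothetical, and the third field is assumed to be the fraction of the source's authors recorded as women (or None where that is unknown).

```python
from collections import defaultdict

def share_by_mode(records):
    """Mean share of women authors per citing mode.

    records: iterable of (content_editor, citation_editor, women_share).
    A citation counts as "new content" when the same editor added both the
    content and the citation, and as "updating" otherwise. Records whose
    women_share is None are skipped.
    """
    sums = defaultdict(lambda: [0.0, 0])  # running total, count
    for content_editor, citation_editor, share in records:
        if share is None:
            continue
        mode = "new content" if content_editor == citation_editor else "updating"
        sums[mode][0] += share
        sums[mode][1] += 1
    return {mode: total / n for mode, (total, n) in sums.items()}

# Invented records for illustration only.
RECORDS = [
    ("EditorA", "EditorA", 1.0),
    ("EditorA", "EditorA", 0.0),
    ("EditorB", "EditorC", 0.5),
    ("EditorB", "EditorC", None),
]
print(share_by_mode(RECORDS))
```

A noticeable gap between the two means would be a hint that the mode of citing matters, though it would say nothing about why.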
Also, finally, often Wikipedia cites web pages and other sources that do not have any individual authorship, e.g. government websites. Remember that Wikipedia prefers open citations over paywalled citations and a lot of the publications behind paywalls are individually authored.
Your proposed research has a lot of interesting challenges and a number of limitations. I'm not saying don't do it, but I am saying start very small and see if you can find any evidence to support your hypothesis before embarking on a larger study. Because contributor behaviour is what you are trying to study, you probably need to do both quantitative and qualitative experiments. E.g. I have described the two modes of citation I do, but I cannot say how typical my behaviour is.
Kerry
-----Original Message----- From: Wiki-research-l [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Leila Zia Sent: Friday, 23 August 2019 3:44 AM To: Research into Wikimedia content and communities < wiki-research-l@lists.wikimedia.org> Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
Hi Greg,
A few comments if you're going to go with "proportion of male vs female authors of the source material used as citations in arbitrary articles":
- Please differentiate between sex (female, male, ...) and gender (woman,
man, ...). My understanding from your initial email is that you want to stay focused on gender, not sex.
- Unless you have reliable sources about the gender of an author, I would
not recommend trying to predict what the gender is. (As you may know, this is not uncommon in social media studies, for example, to predict the gender of the author based on their image or their name. These approaches introduce biases and social challenges.)
- Re your question about whether WMF has resources to look into this
question in-house: I can't speak for the whole of WMF, however, I can share more about the Research team's direction. As part of our future work, we would like to "help contributors monitor violations of core content policies and assess information reliability and bias both granularly and at scale". [1] The question you proposed can fall under assessing bias in content (considering citations as part of the content). I expect us to focus first on the piece about violations of core content policies and information reliability and come back to the bias question later. As a result, we won't have bandwidth to do your proposal in-house at the moment. Sorry about that.
I hope this helps.
Best, Leila
[1] Section 2 of our Knowledge Integrity whitepaper:
https://upload.wikimedia.org/wikipedia/commons/9/9a/Knowledge_Integrity_-_Wi...
On Thu, Aug 22, 2019 at 9:57 AM Greg thenatureprogram@gmail.com wrote:
Hi Kerry, Those are all very interesting ways to look at this. I was thinking mostly along the lines of your first bullet point, but I'd be interested in research in any of those areas.
Thanks, Greg
On Thu, Aug 22, 2019 at 5:00 AM wiki-research-l-request@lists.wikimedia.org wrote:
Send Wiki-research-l mailing list submissions to wiki-research-l@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/wiki-research-l or, via email, send a message with subject or body 'help' to wiki-research-l-request@lists.wikimedia.org
You can reach the person managing the list at wiki-research-l-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wiki-research-l digest..."
Today's Topics:
- gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Kerry Raymond)
--
Message: 1 Date: Wed, 21 Aug 2019 20:19:18 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] gender balance of wikipedia citations Message-ID: < CAOO9DNtY+oDO5oQrMZeG1NZE-kYNYLWnTD6acHeYTbYeGk8k2Q@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Greetings!
I was looking for information about the gender balance of Wikipedia citations and no one I've asked knows of any work on this topic. Do
you?
I think this is an important question.
Here's what I've learned so far:
Wikipedia citations are currently in the form of text strings. There is also an initiative to place citations in an annotated structured repository (wikicite). I do not know the current status of wikicite or if/when this could be used for this inquiry--either to examine all, or a sensible subset of the citations.
My perspective is that understanding the gender balance is necessary and urgent. The balance could be better, the same, or worse than the citation balances we already know, and the scale of the
effect is quite large.
Is this a line of inquiry that the wikimedia/wikicite community is interested in pursuing? If so, what is the best way to get started? Does the WMF have the resources and interest to look into this matter
inhouse?
Thanks for your thoughts.
Greg
Message: 2 Date: Thu, 22 Aug 2019 13:53:45 +1000 From: "Kerry Raymond" kerry.raymond@gmail.com To: "'Research into Wikimedia content and communities'" wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: 00ed01d5589d$33e31ed0$9ba95c70$@gmail.com Content-Type: text/plain; charset="UTF-8"
Could you elaborate a bit more on what you mean by the gender balance of citations?
Are you talking about:
- proportion of male vs female authors of the source material used
as citations in arbitrary articles>
- the quality/quantity of citations in biography articles of men vs
women?
- the quality/quantity of citations in articles that are gendered by
some other criteria (e.g. reader interest, romantic comedy vs action
film)?
Kerry
Subject: Digest Footer
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
End of Wiki-research-l Digest, Vol 168, Issue 11
End of Wiki-research-l Digest, Vol 168, Issue 13
Dear Greg,
The Wikimedia Foundation and various chapters have microgrant programs and an online library. In part this is to counter the project's known bias towards free online sources. If you can come up with software that identifies sources that we aren't using but should, then that would make for some interesting reports on Wikiprojects, or an interesting opportunity for the wiki library.
I'd particularly like to see something along the lines of a bot that sends messages to active Wikipedia editors: "As an editor who has been active in topic zzzzzz we would like to send you a free copy of the new book xxxxxx by yyyyy. Click here to arrange your free copy."
There might be a little grumbling if there was something in the algorithm that meant that half of such gifts happened to have female authors, but I suspect only a little. As long as the books were being offered to currently active editors who actually write content, and there was an option for them to say "actually that's not relevant to my current interests, can I have a copy of x instead?", even the more cynical and jaded members of the community would accept that as the foundation trying to do something useful for once.
My experience is that Wikipedians are most likely to grumble about gender balancing that reduces their personal chances, either through gender-balanced recruitment of staff from a predominantly male pool of volunteers, or gender-balanced trips to Wikimania from that same predominantly male pool of volunteers. But gender balancing by author in the purchase of reference books doesn't disadvantage many Wikipedians, though a skew towards more academic topics and away from military history might get a few grumbles.
WSC
On Fri, 23 Aug 2019 at 08:01, Greg thenatureprogram@gmail.com wrote:
Wow, Kerry! Thank you for taking the time to write all these thoughts out.
I'm asking the question because I'm concerned that the gender balance of the authors being cited on wikipedia is different from the already quite bad patterns in academia. My fear is that the citation gender imbalance on Wikipedia is more pronounced. If so, it is not just perpetuating the problem, but making it worse by surfacing certain authors and ideas even more frequently, or hardly at all. I would like to know if this is the case, and if so, how big the effect is.
In my last message, I mention a study about a set of award-winning political science books (the researchers study the citation gender imbalance for that set). I just saw this study today, but I began to think that it/the set of works--or some similar set of titles--could possibly be a good place to begin, especially if the original researchers were willing to share the list of titles/authors/gender/etc that they put together/worked with. Then it seems it would mostly be a matter of figuring out how to understand how those titles are cited on Wikipedia--through either the citation dataset or wikicite--to see if/how the citation patterns differ (i.e., if the works by women/men are cited more frequently/at the same rate/less frequently on Wikipedia than what the researchers found in the original study).
This seems like it would be easier to do than what you propose, but perhaps the idea is not sound. Until very recently, I thought I could find the answer in an existing paper! I honestly don't know the best way to get the answer, but I would like to know the answer and think it's important to look at.
All of the things you bring up--from the gender of the editor, to the type of editing being done, to the issues around multiple authors/paywalls/year of publication/field--complicate the inquiry, and in particular a larger one. I agree with what you say about doing something small first to see what's there.
Thanks again for all your thoughts. Greg
On Thu, Aug 22, 2019 at 9:41 PM < wiki-research-l-request@lists.wikimedia.org> wrote:
Send Wiki-research-l mailing list submissions to wiki-research-l@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/wiki-research-l or, via email, send a message with subject or body 'help' to wiki-research-l-request@lists.wikimedia.org
You can reach the person managing the list at wiki-research-l-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wiki-research-l digest..."
Today's Topics:
- Re: gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Kerry Raymond)
Message: 1 Date: Thu, 22 Aug 2019 18:47:48 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: < CAOO9DNvBrw_aLkRUp5kYFLdaLJUEK+ddiz-A09MZwiotAdAmUw@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi Leila,
Thanks for your thoughts.
Having just read Troy Vettese's very powerful essay, Sexism in the Academy ( https://nplusonemag.com/issue-34/essays/sexism-in-the-academy/), I wish this were a top priority.
I stumbled upon a study today--it came up in the Washington Post's excellent series on gender bias in political science. The authors look at a set of award winning political science books and the gender imbalance in the citations drawn from google scholar. I'm linking the piece here in case anyone on this list is interested now, or in the future, in how the patterns on Wikipedia compare.
Washington Post piece: "There’s a gender gap in who wins political science book awards – and in how widely they’re cited"
https://www.washingtonpost.com/politics/2019/08/22/theres-gender-gap-who-win...
"Just as significantly, women’s award-winning books receive fewer scholarly citations than men’s award-winning volumes — and this disparity has grown, rather than shrunk, in recent years. Over the entire period, APSA award-winning volumes by women averaged 43 percent fewer citations per year than those by male authors."
Paper: "Winning awards and gaining recognition: An impact analysis of APSA section book prizes" https://www.sciencedirect.com/science/article/abs/pii/S0362331918300867
Best, Greg
On Thu, Aug 22, 2019 at 3:44 PM < wiki-research-l-request@lists.wikimedia.org> wrote:
Today's Topics:
- Re: gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Leila Zia)
- Wikimania 2019 disinformation meetup follow-up (Leila Zia)
- Upcoming Research Newsletter (special issue on gender gap research): New papers open for review (Mohammed Sadat Abdulai)
Message: 1 Date: Thu, 22 Aug 2019 09:57:15 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: <CAOO9DNuSYzzaVwcdqiWA7pj671z3N43XOSwv6DtW0SxWg= L8GQ@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi Kerry, Those are all very interesting ways to look at this. I was thinking mostly along the lines of your first bullet point, but I'd be interested in research in any of those areas.
Thanks, Greg
On Thu, Aug 22, 2019 at 5:00 AM < wiki-research-l-request@lists.wikimedia.org> wrote:
Today's Topics:
- gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Kerry Raymond)
Message: 1 Date: Wed, 21 Aug 2019 20:19:18 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] gender balance of wikipedia citations Message-ID: < CAOO9DNtY+oDO5oQrMZeG1NZE-kYNYLWnTD6acHeYTbYeGk8k2Q@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Greetings!
I was looking for information about the gender balance of Wikipedia citations and no one I've asked knows of any work on this topic. Do you?
I think this is an important question.
Here's what I've learned so far:
Wikipedia citations are currently in the form of text strings. There is also an initiative to place citations in an annotated structured repository (wikicite). I do not know the current status of wikicite or if/when this could be used for this inquiry--either to examine all, or a sensible subset of the citations.
My perspective is that understanding the gender balance is necessary and urgent. The balance could be better, the same, or worse than the citation balances we already know, and the scale of the effect is quite large.
Is this a line of inquiry that the wikimedia/wikicite community is interested in pursuing? If so, what is the best way to get started? Does the WMF have the resources and interest to look into this matter inhouse?
Thanks for your thoughts.
Greg
Message: 2 Date: Thu, 22 Aug 2019 10:43:51 -0700 From: Leila Zia leila@wikimedia.org To: Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: <CAK0Oe2uCo70_=ma2b=2d+fvr4GseEVxOP0sh= ELNOpKdCuUfqA@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi Greg,
A few comments if you're going to go with "proportion of male vs female authors of the source material used as citations in arbitrary articles":
- Please differentiate between sex (female, male, ...) and gender (woman, man, ...). My understanding from your initial email is that you want to stay focused on gender, not sex.
- Unless you have reliable sources about the gender of an author, I would not recommend trying to predict what the gender is. (As you may know, this is not uncommon in social media studies, for example, to predict the gender of the author based on their image or their name. These approaches introduce biases and social challenges.)
- Re your question about whether WMF has resources to look into this question in-house: I can't speak for the whole of WMF, however, I can share more about the Research team's direction. As part of our future work, we would like to "help contributors monitor violations of core content policies and assess information reliability and bias both granularly and at scale". [1] The question you proposed can fall under assessing bias in content (considering citations as part of the content). I expect us to focus first on the piece about violations of core content policies and information reliability and come back to the bias question later. As a result, we won't have bandwidth to do your proposal in-house at the moment. Sorry about that.
I hope this helps.
Best, Leila
[1] Section 2 of our Knowledge Integrity whitepaper:
https://upload.wikimedia.org/wikipedia/commons/9/9a/Knowledge_Integrity_-_Wi...
Message: 3 Date: Thu, 22 Aug 2019 13:36:17 -0700 From: Leila Zia leila@wikimedia.org To: Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] Wikimania 2019 disinformation meetup follow-up Message-ID: <CAK0Oe2sodYJpkuhSqgo3dtfDr= NQ5EK1TdH16F6BOkTyFho9Rg@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Hi,
This message is for those of you who attended the disinformation meet-up [0] in Wikimania 2019 [1] or others who may be interested.
- The notes from our meet-up are now posted in the bottom of the page [0].
- I was tasked to see if space.wmflabs.org is the place for us to continue conversations about this topic. The answer is yes. Thanks to the help of Elena Lappen, we now have a dedicated subcategory for disinformation: https://discuss-space.wmflabs.org/c/research/disinformation . Feel free to subscribe, watch, and/or post new topics if you're involved in this space.
- If you are new to this conversation, please read the purpose of the subcategory at https://discuss-space.wmflabs.org/t/about-the-disinformation-category/949 and welcome! :)
Best, Leila
[0] https://wikimania.wikimedia.org/wiki/2019:Meetups/Disinformation [1] https://wikimania.wikimedia.org/wiki/2019:Program
Message: 4 Date: Thu, 22 Aug 2019 22:43:53 +0000 (UTC) From: Mohammed Sadat Abdulai masssly@ymail.com To: Research Into Wikimedia Content and Communities wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] Upcoming Research Newsletter (special issue on gender gap research): New papers open for review Message-ID: 1625269943.668598.1566513833343@mail.yahoo.com Content-Type: text/plain; charset=UTF-8
Hi everyone,
We’re preparing for the August 2019 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201908 and add your name next to any paper you are interested in covering. Our target publication date is on 31 August 11:59 UTC. As usual, short notes and one-paragraph reviews are most welcome.
For the August edition, we are planning a special issue focusing mainly on recent gender gap/gender bias research. (Upcoming special issues topics may include health and education.) There are about 20 papers from this area on our todo list which will all be covered in the August issue, either as a mere list item or - with your help - in form of a more informative writeup or review. They include:
- Analyzing Gender Stereotyping in Bollywood Movies
- Breaking the glass ceiling on Wikipedia
- Breastfeeding, Authority, and Genre: Women's Ethos in Wikipedia and Blogs
- Cyberfeminism on Wikipedia: Visibility and deliberation in feminist Wikiprojects
- Gender and deletion on Wikipedia
- Gender imbalance and Wikipedia
- Gender Markers in Wikipedia Usernames
- How do students trust Wikipedia? An examination across genders
- Investigating the Gender Pronoun Gap in Wikipedia
- It’s Not What You Think: Gender Bias in Information about Fortune 1000 CEOs on Wikipedia
- Mapping and Bridging the Gender Gap: An Ethnographic Study of Indian Wikipedians and Their Motivations to Contribute
- People Who Can Take It: How Women Wikipedians Negotiate and Navigate Safety
- Redressing Gender Inequities on Wikipedia Through an Editathon
- Similar Gaps, Different Origins? Women Readers and Editors at Greek Wikipedia
- Simulation Experiments on (the Absence of) Ratings Bias in Reputation Systems
- The Gendered Presentation of Professions on Wikipedia
- Who Counts as a Notable Sociologist on Wikipedia? Gender, Race, and the “Professor Test”
- Who Wants to Read This?: A Method for Measuring Topical Representativeness in User Generated Content Systems
- Women and Wikipedia. Diversifying Editors and Enhancing Content through Library Edit-a-Thons
Masssly and Tilman Bayer
[1] Research:Newsletter - Meta[2] WikiResearch (@WikiResearch) on
End of Wiki-research-l Digest, Vol 168, Issue 12
Message: 2 Date: Fri, 23 Aug 2019 14:41:09 +1000 From: "Kerry Raymond" kerry.raymond@gmail.com To: "'Research into Wikimedia content and communities'" wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: 001001d5596c$fe22a100$fa67e300$@gmail.com Content-Type: text/plain; charset="utf-8"
Yes, that was my thought. It would be difficult to know the sex (or the gender) of an author name on a paper. There would inevitably be a lot that you could not determine. And certainly in the sciences multi-author papers are the norm and even where you did know the sex/gender of all, do you assign some part-score? E.g. 0 for all male, 1 for all female, 0.6 for 3 women and 2 men.
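The part-score idea can be written down in a few lines. This is only a minimal sketch, not a method anyone in this thread has proposed in detail: the function name `part_score` and the labels "woman"/"man" are assumptions made for illustration, and an author whose gender is not given by a reliable source is recorded as None rather than guessed, so that the whole work is left unscored.

```python
from typing import Optional, Sequence


def part_score(author_genders: Sequence[Optional[str]]) -> Optional[float]:
    """Share of a work's authors recorded as women (0.0 to 1.0).

    Each entry is "woman", "man", or None when no reliable source gives
    the author's gender. Returns None if the list is empty or any author
    is unknown, so that nothing is guessed.
    """
    if not author_genders or any(g is None for g in author_genders):
        return None
    women = sum(1 for g in author_genders if g == "woman")
    return women / len(author_genders)


# The cases mentioned in the message, plus one undetermined author.
print(part_score(["man", "man"]))                             # all male
print(part_score(["woman", "woman"]))                         # all female
print(part_score(["woman", "woman", "woman", "man", "man"]))  # 3 women, 2 men
print(part_score(["woman", None]))                            # cannot be scored
```

Dropping every work with an unknown author is itself a design choice: it avoids predicting gender, but it also shrinks the sample, and the works that drop out may not be a random subset.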
But I am curious why you are asking the question. Is it that the writing/research of women is under-represented in Wikipedia citations? If so, without conducting any research, I'd say "yes it is under-represented".
But my reason would be because women are under-represented as writers/researchers in the first place. And certainly the older the source, the more likely it is to be written by a man. So to investigate gender bias in citations in Wikipedia, you would have to estimate the proportion of men/women (or at least their outputs) over time in a given discipline and then ask the question, "taking into account the time of publication of a citation and the proportion of men/women active in this discipline at that time, do Wikipedia citations show a sex/gender bias?".
Hmm ... very tricky.
I'd be inclined to suggest starting with a much simpler task. Pick a discipline (preferably one with a professional society who can tell you their estimate of the male/female ratio over (say) the past 5 years), limit the Wikipedia articles to topics in that discipline, and limit the citations to those published within the last 5 years. Indeed, perhaps limit it to publications that are principally from the same country(s) as the professional society from which you get the data (as clearly men/women's participation in any discipline can vary between countries for cultural reasons). Then you have some way to gauge whether Wikipedia is showing more or less gender bias in its citations than the discipline itself exhibits through publication. Quite a challenge!
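The comparison at the end of that small study comes down to simple arithmetic. The sketch below assumes each in-scope citation has already been given a number between 0 and 1 for the share of its authors who are women (None where that could not be determined), and that the professional society supplies a single baseline share for the same period. The function name `citation_gap` and the input numbers are hypothetical and are not real data.

```python
from statistics import mean
from typing import Iterable, Optional, Tuple


def citation_gap(scores: Iterable[Optional[float]],
                 baseline_share: float) -> Tuple[float, float, int]:
    """Compare cited works with a discipline baseline.

    scores: per-citation share of women authors, or None if unknown.
    baseline_share: share of women in the discipline (hypothetical input).
    Returns (observed share, observed minus baseline, citations skipped).
    """
    all_scores = list(scores)
    known = [s for s in all_scores if s is not None]
    if not known:
        raise ValueError("no citations with a known score")
    observed = mean(known)
    return observed, observed - baseline_share, len(all_scores) - len(known)


# Made-up example: four citations, one of which could not be scored.
observed, gap, skipped = citation_gap([1.0, 0.0, 0.5, None], baseline_share=0.25)
print(observed, gap, skipped)
```

A negative gap would suggest that works by women are cited less on Wikipedia than the baseline predicts, and a positive gap the opposite. With a sample this small the number says nothing about significance, and the count of skipped citations should be reported alongside it.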
And of course, it is not Wikipedia that adds citations. It is individual contributors who add citations. Does the sex/gender of the contributor have any correlation to any observed bias? Again, the task is made more difficult because a lot of Wikipedians don't identify their sex/gender.
The other thing to be alert to is the difference in how (I believe) Wikipedians cite compared to researchers. As a researcher, I will of course be reading papers in my field all the time and what I read will influence my subsequent work. Therefore when I write about my research, my citations are referring to papers that I have already read and whose authors may be familiar to me from their other work, having met them at conferences, private correspondence, etc.
However as a Wikipedian, I am only partially operating that way (mostly when I write new articles or significantly expand them, that is, when I am doing the research). A lot of the time I am adding citations relating to content other people (often new users) have added/changed without citation. These come up on my watchlist all the time. What do I do? Of course I could revert saying "no citation provided", but that's not the way to encourage new contributors nor to grow the encyclopedia, so if the information seems plausible (not obviously vandalism), I will attempt to find a citation for it (using tools like Google and other topic-specialised search tools). This is what I call "lucky dip" mode of citing as obviously I have no idea what the source was for the original contributor. The sources I find from my search may not already be known to me (frequently they are not).
Or to summarise, IMHO, researchers (or Wikipedians in "new content mode") cite a source already known to them and whose authors may be known to them and could consciously or unconsciously engage in some discrimination in citation based on sex/gender or other criteria, whereas Wikipedians in "updating mode" are likely to be citing a source not previously known to them and may be happy just to have found a source and are unlikely to be spending a lot of their time researching the authors of that source to the extent they could then consciously or unconsciously exercise discrimination on sex/gender. If I invest any extra effort in such a situation, it's probably because the wording of the source is a close match to the Wikipedia article, which raises the question of copyright violation (which needs to be dealt with by deletion or rewriting) or being a Wikipedia mirror (which is obviously not an acceptable citation).
So I suspect whether a citation was added by the same contributor as the content it supports or a subsequent contributor probably makes a difference to the likelihood of conscious/unconscious discrimination.
Also, finally, often Wikipedia cites web pages and other sources that do not have any individual authorship, e.g. government websites. Remember that Wikipedia prefers open citations over paywalled citations and a lot of the publications behind paywalls are individually authored.
Your proposed research has a lot of interesting challenges and a number of limitations. I'm not saying don't do it, but I am saying start very small and see if you can find any evidence to support your hypothesis before embarking on a larger study. Because contributor behaviour is what you are trying to study, you probably need to do both quantitative and qualitative experiments. E.g. I have described the two modes of citation I do, but I cannot say how typical my behaviour is.
Kerry
-----Original Message----- From: Wiki-research-l [mailto:
wiki-research-l-bounces@lists.wikimedia.org]
On Behalf Of Leila Zia Sent: Friday, 23 August 2019 3:44 AM To: Research into Wikimedia content and communities < wiki-research-l@lists.wikimedia.org> Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
Hi Greg,
A few comments if you're going to go with "proportion of male vs female authors of the source material used as citations in arbitrary articles":
- Please differentiate between sex (female, male, ...) and gender (woman, man, ...). My understanding from your initial email is that you want to stay focused on gender, not sex.
- Unless you have reliable sources about the gender of an author, I would not recommend trying to predict what the gender is. (As you may know, it is not uncommon in social media studies, for example, to predict the gender of the author based on their image or their name. These approaches introduce biases and social challenges.)
- Re your question about whether WMF has resources to look into this question in-house: I can't speak for the whole of WMF, however, I can share more about the Research team's direction. As part of our future work, we would like to "help contributors monitor violations of core content policies and assess information reliability and bias both granularly and at scale". [1] The question you proposed can fall under assessing bias in content (considering citations as part of the content). I expect us to focus first on the piece about violations of core content policies and information reliability and come back to the bias question later. As a result, we won't have bandwidth to do your proposal in-house at the moment. Sorry about that.
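As a rough illustration of the first two points, a tally of cited authors can keep "unknown" as its own category instead of guessing. This is only a minimal sketch: the record list, the placeholder names, and the gender_shares helper are all hypothetical, and it assumes gender is filled in only where a reliable source states it.

```python
from collections import Counter

# Hypothetical records: (cited author, gender as stated by a reliable source, or None).
# The names and values are placeholders, not real data.
cited_authors = [
    ("Author A", "woman"),
    ("Author B", "man"),
    ("Author C", None),  # no reliable source, so no guess is made
    ("Author D", "woman"),
]


def gender_shares(records):
    """Return each category's share of all cited authors.

    Authors without a sourced gender are counted as 'unknown' rather than predicted.
    """
    counts = Counter(
        gender if gender is not None else "unknown" for _, gender in records
    )
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


shares = gender_shares(cited_authors)
```

Reporting the "unknown" share alongside the others shows how much of the sample the result actually rests on.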
I hope this helps.
Best, Leila
[1] Section 2 of our Knowledge Integrity whitepaper:
https://upload.wikimedia.org/wikipedia/commons/9/9a/Knowledge_Integrity_-_Wi...
On Thu, Aug 22, 2019 at 9:57 AM Greg thenatureprogram@gmail.com wrote:
Hi Kerry, Those are all very interesting ways to look at this. I was thinking mostly along the lines of your first bullet point, but I'd be interested in research in any of those areas.
Thanks, Greg
On Thu, Aug 22, 2019 at 5:00 AM wiki-research-l-request@lists.wikimedia.org wrote:
Send Wiki-research-l mailing list submissions to wiki-research-l@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/wiki-research-l or, via email, send a message with subject or body 'help' to wiki-research-l-request@lists.wikimedia.org
You can reach the person managing the list at wiki-research-l-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wiki-research-l digest..."
Today's Topics:
- gender balance of wikipedia citations (Greg)
- Re: gender balance of wikipedia citations (Kerry Raymond)
--
Message: 1 Date: Wed, 21 Aug 2019 20:19:18 -0700 From: Greg thenatureprogram@gmail.com To: wiki-research-l@lists.wikimedia.org Subject: [Wiki-research-l] gender balance of wikipedia citations Message-ID: < CAOO9DNtY+oDO5oQrMZeG1NZE-kYNYLWnTD6acHeYTbYeGk8k2Q@mail.gmail.com> Content-Type: text/plain; charset="UTF-8"
Greetings!
I was looking for information about the gender balance of Wikipedia citations and no one I've asked knows of any work on this topic. Do you?
I think this is an important question.
Here's what I've learned so far:
Wikipedia citations are currently in the form of text strings. There is also an initiative to place citations in an annotated structured repository (wikicite). I do not know the current status of wikicite or if/when this could be used for this inquiry--either to examine all, or a sensible subset of the citations.
My perspective is that understanding the gender balance is necessary and urgent. The balance could be better, the same, or worse than the citation balances we already know, and the scale of the effect is quite large.
Is this a line of inquiry that the wikimedia/wikicite community is interested in pursuing? If so, what is the best way to get started? Does the WMF have the resources and interest to look into this matter in-house?
Thanks for your thoughts.
Greg
Message: 2 Date: Thu, 22 Aug 2019 13:53:45 +1000 From: "Kerry Raymond" kerry.raymond@gmail.com To: "'Research into Wikimedia content and communities'" wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] gender balance of wikipedia citations Message-ID: 00ed01d5589d$33e31ed0$9ba95c70$@gmail.com Content-Type: text/plain; charset="UTF-8"
Could you elaborate a bit more on what you mean by the gender balance of citations?
Are you talking about:
- proportion of male vs female authors of the source material used as citations in arbitrary articles?
- the quality/quantity of citations in biography articles of men vs women?
- the quality/quantity of citations in articles that are gendered by some other criteria (e.g. reader interest, romantic comedy vs action film)?
Kerry
Subject: Digest Footer
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
End of Wiki-research-l Digest, Vol 168, Issue 11
End of Wiki-research-l Digest, Vol 168, Issue 13
WereSpielChequers, 02/09/19 17:10:
If you can come up with software that identifies sources that we aren't using but should then that would make for some interesting reports on Wikiprojects, or an interesting opportunity for the wiki library.
Good point. This also came up at a research meetup at Wikimania, where the question was what corpora could be used (one proposal is the Internet Archive): see notes at https://wikimania.wikimedia.org/wiki/2019:Meetups/Affiliates_and_Research
A related proposal was https://github.com/eggpi/citationhunt/issues/137
I'd particularly like to see something along the lines of a bot that sends messages to active Wikipedia editors "As an editor who has been active in topic zzzzzz we would like to send you a free copy of the new book xxxxxx by yyyyy. Click here to arrange your free copy"
I would be interested in helping you do this (for Italian-language editors?). I think it can easily be written in a way that doesn't sound like a promotion for the specific item.
Federico
Hi WSC,
That is an interesting idea, and the question of the corpus to draw from (or build) is key. In my own work, I have used the following techniques:
- direct solicitations from domain experts. I will express my interest in consulting a diverse range of perspectives and will ask for book titles and/or the names of people from underrepresented demographics who are doing work in the field.
- name-database. I start from a list of underrepresented domain experts (I have found such lists on Wikipedia itself as well as in places like 500womenscientists) and look to see whether any of the people on it have written books on the topic I'm researching.
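The name-database lookup can be sketched roughly as follows. This is only an illustration under stated assumptions: the expert list, the catalogue, and both helper functions are hypothetical placeholders, and exact name matching like this will miss name variants and cannot tell namesakes apart.

```python
import unicodedata


def normalize(name):
    """Lower-case, strip accents and collapse whitespace so simple variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def books_by_listed_experts(experts, catalogue):
    """Return the (author, title) pairs whose author appears in the expert list."""
    wanted = {normalize(name) for name in experts}
    return [
        (author, title)
        for author, title in catalogue
        if normalize(author) in wanted
    ]


# Placeholder data, not real people or books.
experts = ["Example Expert", "Ána  Placeholder"]
catalogue = [
    ("Ana Placeholder", "A Book on the Topic"),
    ("Someone Else", "Another Book"),
]
matches = books_by_listed_experts(experts, catalogue)
```

Any match found this way would still need checking by hand before the book is treated as that person's work.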
For now, my interest remains focused on understanding the current gender balance of the citations on Wikipedia. To this end, I am still very interested in hearing from someone with knowledge of Wikidata about how best to use it and any limitations I should be aware of.
Thanks, Greg