Dear all,
I thank you for your efforts. I am honoured to inform you that our latest research paper about Wikidata and Health has been published in the Journal of Biomedical Informatics (IF=2.9). The paper is available at https://doi.org/10.1016/j.jbi.2019.103292. This paper is the first in our series of research publications about the Medical Wikidata project. If you would like a copy of the paper, please contact me and I will send you the PDF. Our next papers will look for methods to improve the coverage and quality of medical information in Wikidata. We are confident that this work is important, as it will provide trustworthy reference medical information that physicians and computer programs can use to process medical data and enhance the efficiency of health care.
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
GLAM, Research and Education Coordinator, Wikimedia TN User Group
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
____________________
+21629499418
Hi everyone,
We’re preparing for the September 2019 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201909 and add your name next to any paper you are interested in covering. Our target publication date is 30 September 11:59 UTC. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
- DBpedia FlexiFusion: The Best of Wikipedia > Wikidata > Your Data
- Improving Neural Question Generation using World Knowledge
- Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs
- ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia
- The Global Popularity of William Shakespeare in 303 Wikipedias
- The use of collaborative open-access publishing via Wikipedia in university education to embed digital citizenship skills
- Wikidata from a Research Perspective -- A Systematic Mapping Study of Wikidata
- Wikipedia as complementary formative assessment method in University Courses
Masssly and Tilman Bayer
[1] http://meta.wikimedia.org/wiki/Research:Newsletter
[2] https://twitter.com/WikiResearch
Dear Sir or Madam,
On behalf of Emilio Zagheni, I would like to draw your attention to several academic job openings at the Laboratory of Digital and Computational Demography at the Max Planck Institute for Demographic Research. Details of each vacancy and how to apply can be found on the respective websites:
For W2 Research Faculty Position (equivalent to Associate Professor) in the Lab of Digital and Computational Demography
https://www.demogr.mpg.de/en/education_career/jobs_fellowships_1910/w2_rese…
For the Post-Docs/Research Scientists:
https://www.demogr.mpg.de/en/education_career/jobs_fellowships_1910/postdoc…
For the PhD student positions:
https://www.demogr.mpg.de/en/education_career/jobs_fellowships_1910/phd_stu…
For Summer Research Visits:
https://www.demogr.mpg.de/en/education_career/jobs_fellowships_1910/summer_…
Please kindly distribute these vacancies among your fellow researchers and students at your institution.
Thanks so much!
With best wishes,
Antje Gosselck
Max Planck Institute for Demographic Research
Konrad-Zuse-Str. 1
D-18057 Rostock
Germany
http://www.demogr.mpg.de
mailto:gosselck@demogr.mpg.de
Tel. +49 (0) 381 / 2081 108
Fax +49 (0) 381 / 2081 408
Hello everyone,
The next Research Showcase will be live-streamed next Wednesday, September
18, at 9:30 AM PT/16:30 UTC. This will be the new time going forward for
Research Showcases in order to give more access to other timezones.
YouTube stream: https://www.youtube.com/watch?v=fDhAnHrkBks
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
This month's presentations:
Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's
Verifiability
By Miriam Redi, Research, Wikimedia Foundation
Among Wikipedia's core guiding principles, verifiability policies have a
particularly important role. Verifiability requires that information
included in a Wikipedia article be corroborated against reliable secondary
sources. Because of the manual labor needed to curate and fact-check
Wikipedia at scale, however, its contents do not always evenly comply with
these policies. Citations (i.e. references to external sources) may not
conform to verifiability requirements or may be missing altogether,
potentially weakening the reliability of specific topic areas of the free
encyclopedia. In this project
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>,
we aimed to provide an empirical characterization of the reasons why and
how Wikipedia cites external sources to comply with its own verifiability
guidelines. First, we constructed a taxonomy of reasons why inline
citations are required by collecting labeled data from editors of multiple
Wikipedia language editions. We then collected a large-scale crowdsourced
dataset of Wikipedia sentences annotated with categories derived from this
taxonomy. Finally, we designed and evaluated algorithmic models to
determine if a statement requires a citation, and to predict the citation
reason based on our taxonomy. We evaluated the robustness of such models
across different classes of Wikipedia articles of varying quality, as well
as on an additional dataset of claims annotated for fact-checking purposes.
Redi, M., Fetahu, B., Morgan, J., & Taraborelli, D. (2019, May). Citation
Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability.
In The World Wide Web Conference (pp. 1567-1578). ACM.
https://arxiv.org/abs/1902.11116
Patrolling on Wikipedia
By Jonathan T. Morgan, Research, Wikimedia Foundation
I will present initial findings from an ongoing research study
<https://meta.wikimedia.org/wiki/Research:Patrolling_on_Wikipedia> of
patrolling workflows on Wikimedia projects. Editors patrol recent pages and
edits to ensure that Wikimedia projects maintain high quality as new
content comes in. Patrollers revert vandalism and review newly-created
articles and article drafts. Patrolling of new pages and edits is vital
work. In addition to making sure that new content conforms to Wikipedia
project policies, patrollers are the first line of defense against
disinformation, copyright infringement, libel and slander, personal
threats, and other forms of vandalism on Wikimedia projects. This research
project is focused on understanding the needs, priorities, and workflows of
editors who patrol new content on Wikimedia projects. The findings of this
research can inform the development of better patrolling tools as well as
non-technological interventions intended to support patrollers and the
activity of patrolling.
--
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
Down with Python 2! I thought it was already dead!
On Fri, Sep 13, 2019 at 10:15 AM Miriam Redi <mredi(a)wikimedia.org> wrote:
> Thanks Luca!
>
> And yes Diego, that, plus casting iterables to lists :)
> Just in case it's useful:
> http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/promotions/python…
>
> Best,
>
> M
>
> On Fri, Sep 13, 2019 at 3:55 PM Diego Saez-Trumper <diego(a)wikimedia.org>
> wrote:
>
>> Oh! We are getting old!
>> Thanks for the heads up Luca.
>>
>> I would say that 90% of the migration process is to change:
>> print 'x' to print('x')
>>
>> ;)
>> Best!
>>
>>
>> > _______________________________________________
>> > Wiki-research-l mailing list
>> > Wiki-research-l(a)lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>> >
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
--
Aaron Halfaker
Principal Research Scientist
Head of the Scoring Platform team
Wikimedia Foundation
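
For anyone checking their own scripts against the two changes mentioned in this thread (print becoming a function, and wrapping iterables in list()), here is a minimal Python 3 sketch. The function name and the sample dict are made up for illustration; they are not taken from the Phabricator task or from any Analytics package.

```python
# Python 3 sketch of the two migration changes mentioned in the thread.
# `describe_counts` and the sample dict are hypothetical illustration names.


def describe_counts(counts):
    """Return the sorted keys and the doubled values of a dict as plain lists."""
    # In Python 3, dict.keys() and map() return lazy views/iterators,
    # so code that sorts in place or indexes the result must wrap them in list().
    keys = list(counts.keys())
    keys.sort()
    doubled = list(map(lambda k: counts[k] * 2, keys))
    return keys, doubled


keys, doubled = describe_counts({"b": 2, "a": 1})

# Python 2 statement form was:  print 'x'
# Python 3 requires the function form:
print("keys:", keys)
print("doubled:", doubled)
```

Code that relied on Python 2's list-returning keys() or map() often fails only when the result is indexed or sorted, which is why the list() wrapping tends to be the second most common fix after print.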
Hi everybody,
As https://www.python.org/doc/sunset-python-2/ says, Python 2 is finally
going EOL on January 1st, 2020. We (the Analytics team) have a lot of packages
deployed on stat/notebook/hadoop hosts via puppet that should be removed,
but before doing so we need to know whether any of you is currently using
a Python-2-only environment to work/research/test/etc. If so, please
comment on the following task so we can discuss your use case and possibly
find a Python-3 solution: https://phabricator.wikimedia.org/T204737
In the task we are going to add info about common packages that we know
(keras, tensorflow, pytorch, etc.) to help you migrate to Python 3 as
quickly and painlessly as possible, so if you are interested please
subscribe to the task.
Thanks in advance!
Luca (on behalf of the Analytics team)
Hey folks, I'm helping Rebecca Maung (rmaung(a)wikimedia.org) distribute this
request. Her words below:
The Wikimedia Foundation is asking for your feedback in the annual
Community Insights survey. We want to know how well we are supporting your
work on- and off-wiki, and how we can change or improve things in the
future. The opinions you share will directly affect the current and future
work of the Wikimedia Foundation.
If you are a volunteer developer, and have contributed code to any pieces
of MediaWiki, gadgets, or tools, please complete the survey. It is
available in various languages and will take between 15 and 25 minutes to
complete.
Follow this link to the survey:
https://wikimedia.qualtrics.com/jfe/form/SV_0pSrrkJAKVRXPpj?Target=dev
If you have seen a similar message elsewhere and have already taken the
Community Insights survey, please do not take it twice.
You can find more information about this survey on the project page and see
how your feedback helps the Wikimedia Foundation support contributors like
you. This survey is hosted by a third-party service and governed by this
privacy statement. Please visit our frequently asked questions page to find
more information about this survey.
If you need additional help, send an email to surveys(a)wikimedia.org.
Thank you!
//Johan Jönsson
--
Wow, Kerry! Thank you for taking the time to write all these thoughts out.
I'm asking the question because I'm concerned that the gender balance of
the authors being cited on Wikipedia is different from the already quite
bad patterns in academia. My fear is that the citation gender imbalance on
Wikipedia is more pronounced. If so, it is not just perpetuating the
problem, but making it worse by surfacing certain authors and ideas even
more frequently, or hardly at all. I would like to know if this is the
case, and if so, how big the effect is.
In my last message, I mentioned a study about a set of award-winning
political science books (the researchers study the citation gender
imbalance for that set). I just saw this study today, but I began to think
that it/the set of works--or some similar set of titles--could possibly be
a good place to begin, especially if the original researchers were willing
to share the list of titles/authors/gender/etc that they put
together/worked with. Then it seems it would mostly be a matter of figuring
out how to understand how those titles are cited on Wikipedia--through
either the citation dataset or wikicite--to see if/how the citation
patterns differ (i.e., if the works by women/men are cited more
frequently/at the same rate/less frequently on Wikipedia than what the
researchers found in the original study).
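
A rough sketch of the comparison described above, assuming someone already has a list of titles with author gender taken from a reliable source (not predicted from names) and a count of how often each title is cited on Wikipedia. The record fields, the function name, and every number below are placeholders invented for illustration; none of it is real data.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records: every value here is invented for illustration.
# `gender` is assumed to come from a reliable source, not from name guessing.
records = [
    {"title": "Book A", "gender": "woman", "wikipedia_citations": 4},
    {"title": "Book B", "gender": "man", "wikipedia_citations": 10},
    {"title": "Book C", "gender": "woman", "wikipedia_citations": 8},
    {"title": "Book D", "gender": "man", "wikipedia_citations": 6},
]


def mean_citations_by_group(rows):
    """Average Wikipedia citation count per gender label."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["gender"]].append(row["wikipedia_citations"])
    return {label: mean(counts) for label, counts in grouped.items()}


averages = mean_citations_by_group(records)
# Relative gap: how far the women's average sits below the men's average.
gap = 1 - averages["woman"] / averages["man"]
print(averages, round(gap, 2))
```

The resulting gap could then be set next to the figure reported in the original study to see whether the pattern on Wikipedia is better, the same, or worse.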
This seems like it would be easier to do than what you propose, but perhaps
the idea is not sound. Until very recently, I thought I could find the
answer in an existing paper! I honestly don't know the best way to get the
answer, but I would like to know the answer and think it's important to
look at.
All of the things you bring up--from the gender of the editor, to the type
of editing being done, to the issues around multiple authors/paywalls/year
of publication/field--complicate the inquiry, and in particular a larger
one. I agree with what you say about doing something small first to see
what's there.
Thanks again for all your thoughts.
Greg
On Thu, Aug 22, 2019 at 9:41 PM <wiki-research-l-request(a)lists.wikimedia.org>
wrote:
> Send Wiki-research-l mailing list submissions to
> wiki-research-l(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> or, via email, send a message with subject or body 'help' to
> wiki-research-l-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> wiki-research-l-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wiki-research-l digest..."
>
>
> Today's Topics:
>
> 1. Re: gender balance of wikipedia citations (Greg)
> 2. Re: gender balance of wikipedia citations (Kerry Raymond)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 22 Aug 2019 18:47:48 -0700
> From: Greg <thenatureprogram(a)gmail.com>
> To: wiki-research-l(a)lists.wikimedia.org
> Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> Message-ID: <CAOO9DNvBrw_aLkRUp5kYFLdaLJUEK+ddiz-A09MZwiotAdAmUw(a)mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi Leila,
>
> Thanks for your thoughts.
>
> Having just read Troy Vettese's very powerful essay, Sexism in the Academy
> (
> https://nplusonemag.com/issue-34/essays/sexism-in-the-academy/), I wish
> this were a top priority.
>
> I stumbled upon a study today--it came up in the Washington Post's
> excellent series on gender bias in political science. The authors look at a
> set of award-winning political science books and the gender imbalance in
> the citations drawn from Google Scholar. I'm linking the piece here in
> case anyone on this list is interested now, or in the future, in how the
> patterns on Wikipedia compare.
>
> Washington Post piece: "There’s a gender gap in who wins political science
> book awards – and in how widely they’re cited"
>
> https://www.washingtonpost.com/politics/2019/08/22/theres-gender-gap-who-wi…
> "Just as significantly, women’s award-winning books receive fewer scholarly
> citations than men’s award-winning volumes — and this disparity has grown,
> rather than shrunk, in recent years. Over the entire period, APSA
> award-winning volumes by women averaged 43 percent fewer citations per year
> than those by male authors."
>
> Paper: "Winning awards and gaining recognition: An impact analysis of APSA
> section book prizes"
> https://www.sciencedirect.com/science/article/abs/pii/S0362331918300867
>
>
> Best,
> Greg
>
> On Thu, Aug 22, 2019 at 3:44 PM <
> wiki-research-l-request(a)lists.wikimedia.org>
> wrote:
>
> >
> >
> > Today's Topics:
> >
> > 1. Re: gender balance of wikipedia citations (Greg)
> > 2. Re: gender balance of wikipedia citations (Leila Zia)
> > 3. Wikimania 2019 disinformation meetup follow-up (Leila Zia)
> > 4. Upcoming Research Newsletter (special issue on gender gap
> > research): New papers open for review (Mohammed Sadat Abdulai)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Thu, 22 Aug 2019 09:57:15 -0700
> > From: Greg <thenatureprogram(a)gmail.com>
> > To: wiki-research-l(a)lists.wikimedia.org
> > Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> > Message-ID: <CAOO9DNuSYzzaVwcdqiWA7pj671z3N43XOSwv6DtW0SxWg=L8GQ(a)mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Hi Kerry,
> > Those are all very interesting ways to look at this. I was thinking mostly
> > along the lines of your first bullet point, but I'd be interested in
> > research in any of those areas.
> >
> > Thanks,
> > Greg
> >
> > On Thu, Aug 22, 2019 at 5:00 AM <
> > wiki-research-l-request(a)lists.wikimedia.org>
> > wrote:
> >
> > >
> > >
> > > Today's Topics:
> > >
> > > 1. gender balance of wikipedia citations (Greg)
> > > 2. Re: gender balance of wikipedia citations (Kerry Raymond)
> > >
> > >
> > > ----------------------------------------------------------------------
> > >
> > > Message: 1
> > > Date: Wed, 21 Aug 2019 20:19:18 -0700
> > > From: Greg <thenatureprogram(a)gmail.com>
> > > To: wiki-research-l(a)lists.wikimedia.org
> > > Subject: [Wiki-research-l] gender balance of wikipedia citations
> > > Message-ID: <CAOO9DNtY+oDO5oQrMZeG1NZE-kYNYLWnTD6acHeYTbYeGk8k2Q(a)mail.gmail.com>
> > > Content-Type: text/plain; charset="UTF-8"
> > >
> > > Greetings!
> > >
> > > I was looking for information about the gender balance of Wikipedia
> > > citations and no one I've asked knows of any work on this topic. Do you?
> > >
> > > I think this is an important question.
> > >
> > > Here's what I've learned so far:
> > >
> > > Wikipedia citations are currently in the form of text strings. There is
> > > also an initiative to place citations in an annotated structured repository
> > > (wikicite). I do not know the current status of wikicite or if/when this
> > > could be used for this inquiry--either to examine all, or a sensible subset
> > > of the citations.
> > >
> > > My perspective is that understanding the gender balance is necessary and
> > > urgent. The balance could be better, the same, or worse than the citation
> > > balances we already know, and the scale of the effect is quite large.
> > >
> > > Is this a line of inquiry that the wikimedia/wikicite community is
> > > interested in pursuing? If so, what is the best way to get started? Does
> > > the WMF have the resources and interest to look into this matter inhouse?
> > >
> > > Thanks for your thoughts.
> > >
> > > Greg
> > >
> > >
> > > ------------------------------
> > >
> > > Message: 2
> > > Date: Thu, 22 Aug 2019 13:53:45 +1000
> > > From: "Kerry Raymond" <kerry.raymond(a)gmail.com>
> > > To: "'Research into Wikimedia content and communities'"
> > > <wiki-research-l(a)lists.wikimedia.org>
> > > Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> > > Message-ID: <00ed01d5589d$33e31ed0$9ba95c70$(a)gmail.com>
> > > Content-Type: text/plain; charset="UTF-8"
> > >
> > > Could you elaborate a bit more on what you mean by the gender balance of
> > > citations?
> > >
> > > Are you talking about:
> > >
> > > * proportion of male vs female authors of the source material used as
> > > citations in arbitrary articles?
> > > * the quality/quantity of citations in biography articles of men vs women?
> > > * the quality/quantity of citations in articles that are gendered by some
> > > other criteria (e.g. reader interest, romantic comedy vs action film)?
> > >
> > > Kerry
> > >
> > > ------------------------------
> > >
> > > End of Wiki-research-l Digest, Vol 168, Issue 11
> > > ************************************************
> > >
> >
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Thu, 22 Aug 2019 10:43:51 -0700
> > From: Leila Zia <leila(a)wikimedia.org>
> > To: Research into Wikimedia content and communities
> > <wiki-research-l(a)lists.wikimedia.org>
> > Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> > Message-ID: <CAK0Oe2uCo70_=ma2b=2d+fvr4GseEVxOP0sh=ELNOpKdCuUfqA(a)mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Hi Greg,
> >
> > A few comments if you're going to go with "proportion of male vs
> > female authors of the source material used as citations in arbitrary
> > articles":
> >
> > * Please differentiate between sex (female, male, ...) and gender
> > (woman, man, ...). My understanding from your initial email is that
> > you want to stay focused on gender, not sex.
> >
> > * Unless you have reliable sources about the gender of an author, I
> > would not recommend trying to predict what the gender is. (As you may
> > know, this is not uncommon in social media studies, for example, to
> > predict the gender of the author based on their image or their name.
> > These approaches introduce biases and social challenges.)
> >
> > * Re your question about whether WMF has resources to look into this
> > question in-house: I can't speak for the whole of WMF, however, I can
> > share more about the Research team's direction. As part of our future
> > work, we would like to "help contributors monitor violations of core
> > content policies and assess information reliability and bias both
> > granularly and at scale". [1] The question you proposed can fall under
> > assessing bias in content (considering citations as part of the
> > content). I expect us to focus first on the piece about violations of
> > core content policies and information reliability and come back to the
> > bias question later. As a result, we won't have bandwidth to do your
> > proposal in-house at the moment. Sorry about that.
> >
> > I hope this helps.
> >
> > Best,
> > Leila
> >
> > [1] Section 2 of our Knowledge Integrity whitepaper:
> > https://upload.wikimedia.org/wikipedia/commons/9/9a/Knowledge_Integrity_-_W…
> >
> >
> > On Thu, Aug 22, 2019 at 9:57 AM Greg <thenatureprogram(a)gmail.com> wrote:
> > >
> > > Hi Kerry,
> > > Those are all very interesting ways to look at this. I was thinking
> > mostly
> > > along the lines of your first bullet point, but I'd be interested in
> > > research in any of those areas.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Thu, Aug 22, 2019 at 5:00 AM <
> > wiki-research-l-request(a)lists.wikimedia.org>
> > > wrote:
> > >
> > > > Send Wiki-research-l mailing list submissions to
> > > > wiki-research-l(a)lists.wikimedia.org
> > > >
> > > > To subscribe or unsubscribe via the World Wide Web, visit
> > > > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> > > > or, via email, send a message with subject or body 'help' to
> > > > wiki-research-l-request(a)lists.wikimedia.org
> > > >
> > > > You can reach the person managing the list at
> > > > wiki-research-l-owner(a)lists.wikimedia.org
> > > >
> > > > When replying, please edit your Subject line so it is more specific
> > > > than "Re: Contents of Wiki-research-l digest..."
> > > >
> > > >
> > > > Today's Topics:
> > > >
> > > > 1. gender balance of wikipedia citations (Greg)
> > > > 2. Re: gender balance of wikipedia citations (Kerry Raymond)
> > > >
> > > >
> > > >
> ----------------------------------------------------------------------
> > > >
> > > > Message: 1
> > > > Date: Wed, 21 Aug 2019 20:19:18 -0700
> > > > From: Greg <thenatureprogram(a)gmail.com>
> > > > To: wiki-research-l(a)lists.wikimedia.org
> > > > Subject: [Wiki-research-l] gender balance of wikipedia citations
> > > > Message-ID:
> > > > <
> > > > CAOO9DNtY+oDO5oQrMZeG1NZE-kYNYLWnTD6acHeYTbYeGk8k2Q(a)mail.gmail.com>
> > > > Content-Type: text/plain; charset="UTF-8"
> > > >
> > > > Greetings!
> > > >
> > > > I was looking for information about the gender balance of Wikipedia
> > > > citations and no one I've asked knows of any work on this topic. Do
> > you?
> > > >
> > > > I think this is an important question.
> > > >
> > > > Here's what I've learned so far:
> > > >
> > > > Wikipedia citations are currently in the form of text strings. There
> is
> > > > also an initiative to place citations in an annotated structured
> > repository
> > > > (wikicite). I do not know the current status of wikicite or if/when
> > this
> > > > could be used for this inquiry--either to examine all, or a sensible
> > subset
> > > > of the citations.
> > > >
> > > > My perspective is that understanding the gender balance is necessary
> > and
> > > > urgent. The balance could be better, the same, or worse than the
> > citation
> > > > balances we already know, and the scale of the effect is quite large.
> > > >
> > > > Is this a line of inquiry that the wikimedia/wikicite community is
> > > > interested in pursuing? If so, what is the best way to get started?
> > Does
> > > > the WMF have the resources and interest to look into this matter
> > inhouse?
> > > >
> > > > Thanks for your thoughts.
> > > >
> > > > Greg
> > > >
> > > >
> > > > ------------------------------
> > > >
> > > > Message: 2
> > > > Date: Thu, 22 Aug 2019 13:53:45 +1000
> > > > From: "Kerry Raymond" <kerry.raymond(a)gmail.com>
> > > > To: "'Research into Wikimedia content and communities'"
> > > > <wiki-research-l(a)lists.wikimedia.org>
> > > > Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> > > > Message-ID: <00ed01d5589d$33e31ed0$9ba95c70$(a)gmail.com>
> > > > Content-Type: text/plain; charset="UTF-8"
> > > >
> > > > Could you elaborate a bit more on what you mean by the gender balance
> > of
> > > > citations?
> > > >
> > > > Are you talking about:
> > > >
> > > > * proportion of male vs female authors of the source material used as
> > > > citations in arbitrary articles>
> > > > * the quality/quantity of citations in biography articles of men vs
> > women?
> > > > * the quality/quantity of citations in articles that are gendered by
> > some
> > > > other criteria (e.g. reader interest, romantic comedy vs action
> film)?
> > > >
> > > > Kerry
> > > >
> > > > -----Original Message-----
> > > > From: Wiki-research-l [mailto:
> > wiki-research-l-bounces(a)lists.wikimedia.org]
> > > > On Behalf Of Greg
> > > > Sent: Thursday, 22 August 2019 1:19 PM
> > > > To: wiki-research-l(a)lists.wikimedia.org
> > > > Subject: [Wiki-research-l] gender balance of wikipedia citations
> > > >
> > > > Greetings!
> > > >
> > > > I was looking for information about the gender balance of Wikipedia
> > > > citations and no one I've asked knows of any work on this topic. Do
> > > > you?
> > > >
> > > > I think this is an important question.
> > > >
> > > > Here's what I've learned so far:
> > > >
> > > > Wikipedia citations are currently in the form of text strings. There
> > > > is also an initiative to place citations in an annotated structured
> > > > repository (wikicite). I do not know the current status of wikicite
> > > > or if/when this could be used for this inquiry--either to examine
> > > > all, or a sensible subset of the citations.
> > > >
> > > > My perspective is that understanding the gender balance is necessary
> > > > and urgent. The balance could be better, the same, or worse than the
> > > > citation balances we already know, and the scale of the effect is
> > > > quite large.
> > > >
> > > > Is this a line of inquiry that the wikimedia/wikicite community is
> > > > interested in pursuing? If so, what is the best way to get started?
> > > > Does the WMF have the resources and interest to look into this matter
> > > > in-house?
> > > >
> > > > Thanks for your thoughts.
> > > >
> > > > Greg
> > > > _______________________________________________
> > > > Wiki-research-l mailing list
> > > > Wiki-research-l(a)lists.wikimedia.org
> > > > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Thu, 22 Aug 2019 13:36:17 -0700
> > From: Leila Zia <leila(a)wikimedia.org>
> > To: Research into Wikimedia content and communities
> > <wiki-research-l(a)lists.wikimedia.org>
> > Subject: [Wiki-research-l] Wikimania 2019 disinformation meetup
> > follow-up
> > Message-ID:
> > <CAK0Oe2sodYJpkuhSqgo3dtfDr=
> > NQ5EK1TdH16F6BOkTyFho9Rg(a)mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Hi,
> >
> > This message is for those of you who attended the disinformation
> > meet-up [0] at Wikimania 2019 [1] or others who may be interested.
> >
> > * The notes from our meet-up are now posted at the bottom of the page
> > [0].
> >
> > * I was tasked to see if space.wmflabs.org is the place for us to
> > continue conversations about this topic. The answer is yes. Thanks to
> > the help of Elena Lappen, we now have a dedicated subcategory for
> > disinformation:
> > https://discuss-space.wmflabs.org/c/research/disinformation . Feel
> > free to subscribe, watch, and/or post new topics if you're involved in
> > this space.
> >
> > * If you are new to this conversation, please read the purpose of the
> > subcategory at
> > https://discuss-space.wmflabs.org/t/about-the-disinformation-category/949
> > and welcome! :)
> >
> > Best,
> > Leila
> >
> > [0] https://wikimania.wikimedia.org/wiki/2019:Meetups/Disinformation
> > [1] https://wikimania.wikimedia.org/wiki/2019:Program
> >
> >
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Thu, 22 Aug 2019 22:43:53 +0000 (UTC)
> > From: Mohammed Sadat Abdulai <masssly(a)ymail.com>
> > To: Research Into Wikimedia Content and Communities
> > <wiki-research-l(a)lists.wikimedia.org>
> > Subject: [Wiki-research-l] Upcoming Research Newsletter (special issue
> > on gender gap research): New papers open for review
> > Message-ID: <1625269943.668598.1566513833343(a)mail.yahoo.com>
> > Content-Type: text/plain; charset=UTF-8
> >
> > Hi everyone,
> > We’re preparing for the August 2019 research newsletter and looking for
> > contributors. Please take a look at
> > https://etherpad.wikimedia.org/p/WRN201908 and add your name next to any
> > paper you are interested in covering. Our target publication date is
> > 31 August 11:59 UTC. As usual, short notes and one-paragraph reviews are
> > most welcome.
> > For the August edition, we are planning a special issue focusing mainly
> > on recent gender gap/gender bias research. (Upcoming special issue topics
> > may include health and education.) There are about 20 papers from this
> > area on our to-do list, all of which will be covered in the August issue,
> > either as a mere list item or, with your help, in the form of a more
> > informative writeup or review. They include:
> > - Analyzing Gender Stereotyping in Bollywood Movies
> >
> > - Breaking the glass ceiling on Wikipedia
> >
> > - Breastfeeding, Authority, and Genre: Women's Ethos in Wikipedia and
> > Blogs
> >
> > - Cyberfeminism on Wikipedia: Visibility and deliberation in feminist
> > Wikiprojects
> >
> > - Gender and deletion on Wikipedia
> >
> > - Gender imbalance and Wikipedia
> >
> > - Gender Markers in Wikipedia Usernames
> >
> > - How do students trust Wikipedia? An examination across genders
> >
> > - Investigating the Gender Pronoun Gap in Wikipedia
> >
> > - It’s Not What You Think: Gender Bias in Information about Fortune
> > 1000 CEOs on Wikipedia
> >
> > - Mapping and Bridging the Gender Gap: An Ethnographic Study of Indian
> > Wikipedians and Their Motivations to Contribute
> >
> > - People Who Can Take It: How Women Wikipedians Negotiate and Navigate
> > Safety
> >
> > - Redressing Gender Inequities on Wikipedia Through an Editathon
> >
> > - Similar Gaps, Different Origins? Women Readers and Editors at Greek
> > Wikipedia
> >
> > - Simulation Experiments on (the Absence of) Ratings Bias in
> > Reputation Systems
> >
> > - The Gendered Presentation of Professions on Wikipedia
> >
> > - Who Counts as a Notable Sociologist on Wikipedia? Gender, Race, and
> > the “Professor Test”
> >
> > - Who Wants to Read This?: A Method for Measuring Topical
> > Representativeness in User Generated Content Systems
> >
> > - Women and Wikipedia. Diversifying Editors and Enhancing Content
> > through Library Edit-a-Thons
> >
> > Masssly and Tilman Bayer
> >
> > [1] Research:Newsletter - Meta
> > [2] WikiResearch (@WikiResearch) on Twitter
> >
> ------------------------------
>
> Message: 2
> Date: Fri, 23 Aug 2019 14:41:09 +1000
> From: "Kerry Raymond" <kerry.raymond(a)gmail.com>
> To: "'Research into Wikimedia content and communities'"
> <wiki-research-l(a)lists.wikimedia.org>
> Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> Message-ID: <001001d5596c$fe22a100$fa67e300$(a)gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Yes, that was my thought. It would be difficult to know the sex (or the
> gender) of an author name on a paper. There would inevitably be a lot that
> you could not determine. And certainly in the sciences multi-author papers
> are the norm and even where you did know the sex/gender of all, do you
> assign some part-score? E.g. 0 for all male, 1 for all female, 0.6 for 3
> women and 2 men.
>
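One way to pin down the part-score idea above is the minimal Python sketch below. It assumes each author's gender is recorded only where a reliable source exists ("woman", "man", or None when unknown); the function name and the labels are illustrative and do not come from any existing tool. Authors of unknown gender are left out of the score rather than guessed.

```python
from typing import Optional, Sequence


def part_score(author_genders: Sequence[Optional[str]]) -> Optional[float]:
    """Share of women among the authors whose gender is known.

    0.0 means all known authors are men, 1.0 means all are women.
    Returns None when no author's gender is known.
    """
    # Keep only authors with a recorded gender; never infer from names.
    known = [g for g in author_genders if g in ("woman", "man")]
    if not known:
        return None
    return sum(1 for g in known if g == "woman") / len(known)


# Three women and two men, as in the example above.
print(part_score(["woman", "woman", "woman", "man", "man"]))
```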
> But I am curious why you are asking the question? That the
> writing/research of women is under-represented in Wikipedia citations? If
> so, without conducting any research, I'd say "yes it is under-represented".
> But my reason would be because women are under-represented as
> writers/researchers in the first place. And certainly the older the
> source, the more likely it is to be written by a man. So to investigate
> gender bias in citations in Wikipedia, you would have to estimate the
> proportion of men/women (or at least their outputs) over time in a given
> discipline and then ask the question, "taking into account the time of
> publication of a citation and the proportion of men/women active in this
> discipline at that time, do Wikipedia citations show a sex/gender bias?".
> Hmm ... very tricky.
>
> I'd be inclined to suggest starting with a much simpler task. Pick a
> discipline (preferably one with a professional society who can tell you
> their estimate of current male/female ratio over (say) the past 5 years),
> limit the Wikipedia articles to topics in that discipline, and limit the
> citations to those published within the last 5 years. Indeed, perhaps
> limiting it to publications that are principally from the same country(s)
> as the professional society from which you get the data (as clearly
> men/women's participation in any discipline can vary with different
> countries for cultural reasons). Then you have some way to gauge whether
> Wikipedia is showing more or less gender bias in its citations than the
> discipline itself exhibits through publication. Quite a challenge!
>
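The simpler study described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the citations have already been labelled with author counts from reliable sources, the baseline share comes from the professional society, and the `Citation` class, the function names and the numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Citation:
    year: int            # publication year of the cited work
    women_authors: int   # authors reliably known to be women
    men_authors: int     # authors reliably known to be men


def women_share(citations: List[Citation], first_year: int,
                last_year: int) -> Optional[float]:
    """Share of women among known authors of citations in the year window."""
    in_window = [c for c in citations if first_year <= c.year <= last_year]
    women = sum(c.women_authors for c in in_window)
    total = women + sum(c.men_authors for c in in_window)
    return women / total if total else None


def gap_vs_baseline(citations: List[Citation], baseline_share: float,
                    first_year: int, last_year: int) -> Optional[float]:
    """Wikipedia share minus the discipline's own share.

    Negative means women are cited less than their share of the discipline.
    """
    share = women_share(citations, first_year, last_year)
    return None if share is None else share - baseline_share


# Made-up numbers for one discipline, limited to a five-year window.
sample = [Citation(2017, 1, 1), Citation(2018, 0, 2), Citation(2010, 5, 0)]
print(gap_vs_baseline(sample, baseline_share=0.5,
                      first_year=2015, last_year=2019))
```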
> And of course, it is not Wikipedia that adds citations. It is individual
> contributors who add citations. Does the sex/gender of the contributor have
> any correlation to any observed bias? Again, the task is made more
> difficult because a lot of Wikipedians don't identify their sex/gender.
>
> The other thing to be alert to is the difference in how (I believe)
> Wikipedians cite compared to researchers. As a researcher, I will of course
> be reading papers in my field all the time and what I read will influence
> my subsequent work. Therefore when I write about my research, my citations
> are referring to papers that I have already read and whose authors may be
> familiar to me from their other work, having met them at conferences,
> private correspondence, etc. However as a Wikipedian, I am only partially
> operating that way (mostly when I write new articles or significantly
> expand them, that is, when I am doing the research). A lot of the time I am
> adding citations relating to content other people (often new users) have
> added/changed without citation. These come up on my watchlist all the time.
> What do I do? Of course I could revert saying "no citation provided", but
> that's not the way to encourage new contributors nor to grow the
> encyclopedia, so if the information seems plausible (not obviously
> vandalism), I will attempt to find a citation for it (using tools like
> Google and other topic-specialised search tools). This is what I call "lucky
> dip" mode of citing as obviously I have no idea what the source was for the
> original contributor. The sources I find from my search may not already be
> known to me (frequently they are not). Or to summarise, IMHO, researchers
> (or Wikipedians in "new content mode") cite a source already known to them
> and whose authors may be known to them and could consciously or
> unconsciously engage in some discrimination in citation based on sex/gender
> or other criteria, whereas Wikipedians in "updating mode" are likely to be
> citing a source not previously known to them and may be happy just to have
> found a source and are unlikely to be spending a lot of their time
> researching the authors of that source to the extent they could then
> consciously or unconsciously exercise discrimination on sex/gender. If I
> invest any extra effort in such a situation, it's probably because the
> wording of the source is a close match to the Wikipedia article which raises
> the question of copyright violation (which needs to be dealt with by
> deletion or rewriting) or being a Wikipedia mirror (which is obviously not
> an acceptable citation).
>
> So I suspect whether a citation was added by the same contributor as the
> content it supports or a subsequent contributor probably makes a difference
> to the likelihood of conscious/unconscious discrimination.
>
> Also, finally, often Wikipedia cites web pages and other sources that do
> not have any individual authorship, e.g. government websites. Remember that
> Wikipedia prefers open citations over paywalled citations and a lot of the
> publications behind paywalls are individually authored.
>
> Your proposed research has a lot of interesting challenges and a number of
> limitations. I'm not saying don't do it, but I am saying start very small
> and see if you can find any evidence to support your hypothesis before
> embarking on a larger study. Because contributor behaviour is what you are
> trying to study, you probably need to do both quantitative and qualitative
> experiments. E.g. I have described the two modes of citation I do, but I
> cannot say how typical my behaviour is.
>
> Kerry
>
> -----Original Message-----
> From: Wiki-research-l [mailto:wiki-research-l-bounces@lists.wikimedia.org]
> On Behalf Of Leila Zia
> Sent: Friday, 23 August 2019 3:44 AM
> To: Research into Wikimedia content and communities <
> wiki-research-l(a)lists.wikimedia.org>
> Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
>
> Hi Greg,
>
> A few comments if you're going to go with "proportion of male vs female
> authors of the source material used as citations in arbitrary
> articles":
>
> * Please differentiate between sex (female, male, ...) and gender (woman,
> man, ...). My understanding from your initial email is that you want to
> stay focused on gender, not sex.
>
> * Unless you have reliable sources about the gender of an author, I would
> not recommend trying to predict what the gender is. (As you may know, it
> is not uncommon in social media studies, for example, to predict the
> gender of an author based on their image or their name. These approaches
> introduce biases and social challenges.)
>
> * Re your question about whether WMF has resources to look into this
> question in-house: I can't speak for the whole of WMF, however, I can share
> more about the Research team's direction. As part of our future work, we
> would like to "help contributors monitor violations of core content
> policies and assess information reliability and bias both granularly and at
> scale". [1] The question you proposed can fall under assessing bias in
> content (considering citations as part of the content). I expect us to
> focus first on the piece about violations of core content policies and
> information reliability and come back to the bias question later. As a
> result, we won't have bandwidth to do your proposal in-house at the moment.
> Sorry about that.
>
> I hope this helps.
>
> Best,
> Leila
>
> [1] Section 2 of our Knowledge Integrity whitepaper:
>
> https://upload.wikimedia.org/wikipedia/commons/9/9a/Knowledge_Integrity_-_W…
>
>
> On Thu, Aug 22, 2019 at 9:57 AM Greg <thenatureprogram(a)gmail.com> wrote:
> >
> > Hi Kerry,
> > Those are all very interesting ways to look at this. I was thinking
> > mostly along the lines of your first bullet point, but I'd be
> > interested in research in any of those areas.
> >
> > Thanks,
> > Greg
>
Thanks once again for writing up your thoughts, Kerry. All very interesting.
Your comment about 'reflection of the real world' caught my eye. I believe
that the real world is moving towards acknowledging that bias exists and
that it won't just go away on its own. I see web-based tools for assessing
the gender balance of citations; I see people studying bias in things like
hiring and promotion, as well as different strategies for addressing it; I
see organizations like VIDA (https://www.vidaweb.org/) counting the number
of female writers in different journals, and journals responding because
the imbalance is known and public. I think the real world is moving towards
acknowledging and proactively addressing inequity. If the Wikipedia
community is not studying its biases and designing tools and strategies for
addressing them, it is not reflecting the world, but lagging behind it.
Frankly, if a few shouting misogynists have a problem with such initiatives,
I don't mind :) It sounds like my citation-tag idea doesn't make too much
sense, but I'd love to hear any other thoughts.
Greg
On Wed, Aug 28, 2019 at 3:27 PM <wiki-research-l-request(a)lists.wikimedia.org>
wrote:
> Send Wiki-research-l mailing list submissions to
> wiki-research-l(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> or, via email, send a message with subject or body 'help' to
> wiki-research-l-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> wiki-research-l-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wiki-research-l digest..."
>
>
> Today's Topics:
>
> 1. Re: gender balance of Wikipedia citations (Kerry Raymond)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 29 Aug 2019 08:26:45 +1000
> From: "Kerry Raymond" <kerry.raymond(a)gmail.com>
> To: "'Research into Wikimedia content and communities'"
> <wiki-research-l(a)lists.wikimedia.org>, <jane023(a)gmail.com>
> Subject: Re: [Wiki-research-l] gender balance of Wikipedia citations
> Message-ID: <006701d55def$aea84d50$0bf8e7f0$(a)gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> FWIW, I think there would be pushback against a quality tag that
> highlighted little/no citation of women's work (whether we are talking
> first author or not) in an article. There are two reasons for this. One is
> the misogyny that really does exist within the English Wikipedia
> "community" (those who do most of the shouting and hence decision making);
> they will argue that firstly gender balance of citations doesn't matter,
> secondly it is a reflection of the real world and thirdly that Wikipedia
> has a policy that it is not there to Right Great Wrongs.
>
> More practically, we know that whole-of-article quality tagging doesn't
> tend to have a lot of impact in terms of getting people to fix anything,
> compared to more specific tags like "citation needed", "dubious", "says
> who" and so on placed on specific pieces of text. People are much more
> likely to fix a specific problem and then remove the specific tag. Even
> when a person does respond to a generic tag like "more references needed"
> and add in some more references, they rarely remove the generic tag
> thinking "well, there's still plenty of scope here to add more references".
> Who among us is willing to declare "that article is 100% fully referenced
> by reliable sources"? Nobody it seems, it's a tag that lingers forever ...
>
> So I think a specific tag to encourage the expansion of "Bloggs et al"
> citations to full author listings might work. It's a somewhat boring and
> mechanical task to expand "et al" but we do have people who are happy to
> contribute in that way. It might even be possible to build a tool to assist
> them which looks up the paper in WikiCite or Google Scholar etc to extract
> the full author list as published (just as we have tools that make it easier
> to fix typos and spelling, disambiguate links and so forth). That would
> address the problem of women authors not being first cited and lost in the
> mists of "et al". However, it is unlikely to be obvious to the average
> contributor whether the paper with the full author list of A.B. Brown, C.D.
> Jones, E.F. Smith and G.H. Walker does or doesn't have any female authors,
> so I can't see that it's going to be easy to motivate people to try to find
> additional citations which do have more female authors.
>
> And, as much as gender inequity is a wrong I'd like to see righted, I
> don't want to see campaigns just to "add in more female authored citations"
> (I call this "citation sprinkling") on Wikipedia. A citation has to be
> there because it verifies the information in the article and not to meet a
> gender quota. Remember that for a lot of Wikipedia contributors, academic
> literature is mostly behind a paywall so they can't actually read more than
> the title and abstract at best. A "sprinkling" campaign is likely to see
> citations based only on title and abstract ("well, it sounds like this
> paper which includes a woman author is talking about this topic") but the
> paper may not support the specific claim made in the text (indeed, it might
> say the exact opposite). A sprinkling campaign should only target the
> Further Reading section whose role is:
>
> "The Further reading section of an article contains a bulleted list of a
> reasonable number of works which a reader may consult for additional and
> more detailed coverage of the subject of the article. In articles with
> numerous footnotes, it probably is not obvious which ones are suitable for
> further reading. The "Further reading" section can help the readers by
> listing selected titles without worrying about duplications."
>
> which would avoid the risk of adding a citation that doesn't support the
> specific claims being made in the article. So maybe it would be possible to
> add a "skewed gender balance" tag onto the Further reading section and/or
> External links section whose role is
>
> "Some acceptable links include those that contain further research that is
> accurate and on-topic, information that could not be added to the article
> for reasons such as copyright or amount of detail, or other meaningful,
> relevant content that is not suitable for inclusion in an article for
> reasons unrelated to its accuracy."
>
> The downside of this idea of adding female authors to the Further Reading
> and External Links sections is that it is unclear whether anyone ever looks
> at them. Currently over 50% of Wikipedia hits are via mobile device. The mobile
> render of a Wikipedia article is not the whole article as you see on
> desktop and laptop but rather you select the sections you want to read, so
> for mobile readers we do know precisely what sections they are opening from
> which we have learned that people in developed countries are not generally
> reading whole articles but specific sections (suggesting seeking answers to
> a specific need rather than a desire to fully appreciate the topic), and
> they don't tend to open anything after the References as a rule, so they
> aren't looking at Further Reading and External links anyway. Are
> desktop/laptop readers looking at them either? We don't really know as they
> get the whole article rendered as a single result and it would really only
> be eye-tracking studies (an expensive type of experiment) that would give
> us this insight with the same accuracy as our mobile data.
>
> Aside, in less developed countries, readers are more likely to read whole
> articles on a mobile device. While the reasons for this difference are not
> proven, I'd be prepared to guess at two interlinked hypotheses. Firstly,
> such countries have poorer standards of education so people may be using
> Wikipedia to supplement their limited formal education. Also such countries
> are more likely to be using rote learning in their education system
> (valuing the ability to memorise and reproduce) rather than the more
> problem-solving learning approaches increasingly in use in the education
> systems of more developed countries. That would also explain
> whole-of-article viewing rather than selecting specific sub-sections.
>
> In some ways, I think a better solution might be to try to get Google
> Scholar interested in the issue of gender. What if articles listed on
> Google scholar came with a little gender balance score (a bit like hotel
> ratings). One blue star (or some other symbol) for one male author, two
> blue stars (two male authors), one pink and one blue star (first author
> female, second author male), etc. Why I like the idea is that it is a
> simple-to-understand visual aid to draw attention to gender imbalance more
> widely but without a specific call to action (which as I outline above may
> backfire if citations get added for gender balance rather than content). It
> potentially helps address the real world problem which would hopefully flow
> through to Wikipedia. Also Google Scholar is probably a lot better
> resourced to build the tools to do the legwork of determining gender (I
> guess a white star is used when it can't). The risk, though, which Leila
> has previously mentioned, is that automated tools don't do a great job of
> getting gender correct, particularly as the tools are often trained on
> limited data sets such as mostly white people, making the automated gender
> guessing of non-white people more likely to be incorrect. However,
> authors can establish their own Google Scholar profile (if the author's
> name is underlined, it's a link to their profile); that's a place where they
> could disclose their gender if they desired, or correct Google Scholar's
> mistaken guess, or demand that Google Scholar not show their gender
> (whatever should be their choice). Hmm, might it lead to catfishing?
> Authors passing themselves off as a different gender? Hmm ...
>
> Another place where we might explore marking gender in some easily visible
> way is WikiCite, but frankly I know little about that project so cannot
> comment on it nor the merits of doing it there rather than on Google
> Scholar. I don't think traditional journal publishers are likely to be keen
> to show gender balance on their own websites as I think they would realise
> it would enable webscraping to reveal their overall gender balance profile,
> leading to some adverse headlines about "Brandname journals worst for
> gender equity". But Google Scholar has less to fear unless it was
> demonstrated that they exhibited stronger gender bias than the journals
> themselves but I would think that Google Scholar aggregates papers without
> any regard to the gender of the authors, but I guess it might not aggregate
> all topic areas equally. For example, if they didn't make much effort to
> include (say) nursing publications (a more female academic discipline) but
> went hard on engineering publications (a more male academic discipline), I
> guess it would skew their author gender balance towards men.
>
> Kerry
>
> -----Original Message-----
> From: Wiki-research-l [mailto:wiki-research-l-bounces@lists.wikimedia.org]
> On Behalf Of Greg
> Sent: Thursday, 29 August 2019 4:06 AM
> To: Research into Wikimedia content and communities <
> wiki-research-l(a)lists.wikimedia.org>; jane023(a)gmail.com
> Subject: Re: [Wiki-research-l] gender balance of Wikipedia citations
>
> Hi Jane,
>
> Thanks for the link. It's clear that there is a lot of work being done,
> and even more left to do.
>
> I've been thinking about what you said about second authors and was
> wondering if instead of fixing it (or in addition to fixing it), it would
> make sense to put some sort of tag on the page itself (like the ones I see
> questioning notability or requests for additional citations). Something
> along the lines of authors missing from a particular citation and how to
> fix that, or no work by women cited in this article (if this is the case).
> It strikes me that by fixing it yourself, you are doing great work, but
> that maybe it also makes sense to spread awareness about these issues to
> the broader editing community so more people are thinking about it/doing
> it. At any rate, I thought I'd float the idea. Such a tag/the response (if
> any), could also be interesting to study, though perhaps something like
> this already exists and I'm just not aware of it, or perhaps there is good
> reason not to do it.
>
> All best,
> Greg
>
> On Tue, Aug 27, 2019 at 5:00 AM <
> wiki-research-l-request(a)lists.wikimedia.org>
> wrote:
>
> > Send Wiki-research-l mailing list submissions to
> > wiki-research-l(a)lists.wikimedia.org
> >
> > To subscribe or unsubscribe via the World Wide Web, visit
> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> > or, via email, send a message with subject or body 'help' to
> > wiki-research-l-request(a)lists.wikimedia.org
> >
> > You can reach the person managing the list at
> > wiki-research-l-owner(a)lists.wikimedia.org
> >
> > When replying, please edit your Subject line so it is more specific
> > than "Re: Contents of Wiki-research-l digest..."
> >
> >
> > Today's Topics:
> >
> > 1. Re: gender balance of wikipedia citations (Greg)
> > 2. Re: gender balance of Wikipedia citations (Jane Darnell)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Mon, 26 Aug 2019 18:56:12 -0700
> > From: Greg <thenatureprogram(a)gmail.com>
> > To: Isaac Johnson <isaac(a)wikimedia.org>
> > Cc: Research into Wikimedia content and communities
> > <wiki-research-l(a)lists.wikimedia.org>
> > Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> > Message-ID:
> > <
> > CAOO9DNv92bVR2COT2XmpHDU5kJOvD0yD3bahG+6Fkuma+HYDEg(a)mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Thanks, Isaac and Federico. These notes and links are very
> > helpful--and will require some time to process. As for how many years
> > I have to work on this, I'm retired! In truth, I keep hoping that
> > someone on this list will express interest in working on these
> > matters. The questions are all very interesting and quite relevant.
> > The idea of studying removed citations is both complex and compelling.
> >
> > Greg
> >
> > On Mon, Aug 26, 2019 at 6:49 AM Isaac Johnson <isaac(a)wikimedia.org> wrote:
> >
> > > Regarding data, I have not been a part of these projects but I think
> > > that I can help a bit with working links:
> > > * The (I believe) original dataset can also be found here:
> > > https://analytics.wikimedia.org/datasets/archive/public-datasets/all/mwrefs/
> > > * A newer version of this dataset was produced that also included
> > > information about whether the source was openly available and its topic:
> > > ** Meta page:
> > > https://meta.wikimedia.org/wiki/Research:Towards_Modeling_Citation_Quality
> > > ** Figshare:
> > > https://figshare.com/articles/Accessibility_and_topics_of_citations_with_identifiers_in_Wikipedia/6819710
> > >
> > > On Mon, Aug 26, 2019 at 3:53 AM Federico Leva (Nemo)
> > > <nemowiki(a)gmail.com> wrote:
> > >
> > >> Greg, 22/08/19 06:19:
> > >> > I do not know the current status of wikicite or if/when this
> > >> > could be used for this inquiry--either to examine all, or a
> > >> > sensible subset of the citations.
> > >>
> > >> If I see correctly, you still did not receive an answer on the data
> > >> available.
> > >>
> > >> It's true that the Figshare item for
> > >> <https://meta.wikimedia.org/wiki/Research:Scholarly_article_citations_in_Wikipedia>
> > >> was deleted (I've asked about it on the talk page), but it's
> > >> trivial to run https://pypi.org/project/mwcites/ and extract the
> > >> data yourself, at least for citations which use an identifier.
> > >>
> > >> Some example datasets produced this way:
> > >> https://zenodo.org/record/15871
> > >> https://zenodo.org/record/55004
> > >> https://zenodo.org/record/54799
> > >>
> > >> Once you extract the list of works, the fun begins. You'll need to
> > >> intersect with other data sources (Wikidata, ORCID, other?) and
> > >> account for a number of factors until you manage to find a subset
> > >> of the data which has a sufficiently high signal:noise ratio. For
> > >> instance you might need to filter or normalise by
> > >> * year of publication (some year recent enough to have good data
> > >> but old enough to allow the work to be cited elsewhere, be archived
> > >> after embargos);
> > >> * country or institution (some probably have better ORCID
> > >> coverage);
> > >> * field/discipline and language;
> > >> * open access status (per Unpaywall);
> > > >> * number of expected pageviews and clicks (for instance using
> > > >> <https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews> and
> > > >> <https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream#Releases>;
> > > >> a link from 10k articles on asteroids or proteins is not the same
> > > >> as being the lone link from a popular article which is not the same
> > > >> as a link buried among a thousand others on a big article);
> > >> * time or duration of the addition (with one of the various diff
> > >> extraction libraries, content persistence data or possibly
> > >> historical eventstream if such a thing is available).
> > >>
> > >> To avoid having to invent everything yourself, maybe you can reuse
> > >> the method of some similar study, for instance the one on the open
> > >> access citation advantage or one of the many which studied the
> > >> gender imbalance of citations and peer review in journals.
> > >>
> > >> However, it's very possible that the noise is just too much for a
> > >> general computational method. You might consider a more manual
> > >> approach on a sample of relevant events, for instance the *removal*
> > >> of citations, which is in my opinion more significant than the
> > >> addition.* You might extract all the diffs which removed a citation
> > >> from an article in the last N years (probably they'll be in the
> > > >> order of 10^5 rather than 10^6), remove some massive events or
> > > >> outliers, sample 500-1000 of them randomly and verify the required
> > > >> data manually.
> > >>
> > >> As usual it will be impossible to have an objective assessment of
> > >> whether that citation was really (in)appropriate in that context
> > >> according to the (English or whatever) Wikipedia guidelines. To
> > >> test that too, you should replicate one of the various studies of
> > >> the gender imbalance of peer review, perhaps one of those which
> > > >> tried to assess the impact of a double blind peer review system on
> > > >> the gender imbalance.
> > >> However, because the sources are already published, you'd need to
> > >> provide the agendered information yourself and make sure the
> > >> participants perform their assessment in some controlled
> > >> environment where they don't have access to any gendered
> > >> information (i.e. where you cut them off the internet).
> > >>
> > >> How many years do you have to work on this project? :-)
> > >>
> > >> Federico
> > >>
> > >> (*) I might add a citation just because it's the first result a
> > >> popular search engine gives me, after glancing at the abstract and
> > >> maybe the journal home page; but if I remove an existing citation,
> > >> hopefully I've at least assessed its content and made a judgement
> > >> about it, apart from cases of mass removals for specific problems
> > >> with certain articles or publication venues.
> > >>
> > >> _______________________________________________
> > >> Wiki-research-l mailing list
> > >> Wiki-research-l(a)lists.wikimedia.org
> > >> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> > >>
> > >
> > >
> > > --
> > > Isaac Johnson -- Research Scientist -- Wikimedia Foundation
> > >
> >
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Tue, 27 Aug 2019 08:00:45 +0200
> > From: Jane Darnell <jane023(a)gmail.com>
> > To: Research into Wikimedia content and communities
> > <wiki-research-l(a)lists.wikimedia.org>
> > Subject: Re: [Wiki-research-l] gender balance of Wikipedia citations
> > Message-ID:
> > <CAFVcA-HqVicR0k65J4iox0PD=
> > oc3HBPMZLfXVO5zqkFD+EnSxQ(a)mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Greg,
> > Yes that's what I meant. On Wikipedia you get what you measure, so
> > many Wikipedians are page-creators and page-hit junkies because we can
> > measure that. The trick to motivating editors is giving them other
> > measurements for progress. Here is the link to the Women writers
> > Wikiproject and as you scroll down you can see what is measured.
> > https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Women_writers
> > Jane
> >
> > On Tue, Aug 27, 2019 at 3:39 AM Greg <thenatureprogram(a)gmail.com> wrote:
> >
> > > Thanks for sharing your experience and thoughts, Jane. I did not know
> > > this was happening--I'm hardly an expert, so that's not surprising, and
> > > yet it's still very troubling to hear. I'm not sure what you mean by
> > > setting up a Wikiproject. Do you mean ways to study this gap--i.e., the
> > > ideas that have been floated in this thread to this point? Or are you
> > > thinking of something else?
> > >
> > > Greg
> > >
> > > On Mon, Aug 26, 2019 at 5:00 AM <
> > > wiki-research-l-request(a)lists.wikimedia.org>
> > > wrote:
> > >
> > > >
> > > > Today's Topics:
> > > >
> > > > 1. Re: gender balance of Wikipedia citations (WereSpielChequers)
> > > > 2. Re: gender balance of Wikipedia citations (Greg)
> > > > 3. Re: sockpuppets and how to find them sooner (Federico Leva (Nemo))
> > > > 4. Re: gender balance of Wikipedia citations (Jane Darnell)
> > > > 5. Re: gender balance of wikipedia citations (Federico Leva (Nemo))
> > > >
> > > >
> > > > ----------------------------------------------------------------------
> > > >
> > > > Message: 1
> > > > Date: Sun, 25 Aug 2019 14:28:25 +0100
> > > > From: WereSpielChequers <werespielchequers(a)gmail.com>
> > > > To: Research into Wikimedia content and communities
> > > > <wiki-research-l(a)lists.wikimedia.org>
> > > > Subject: Re: [Wiki-research-l] gender balance of Wikipedia
> > > > citations
> > > > Message-ID:
> > > > <CAAanWP3qJnMpLB4tr9Eqt4EJLg2kCihkb50UY-d8=
> > > > ShNONhSAA(a)mail.gmail.com>
> > > > Content-Type: text/plain; charset="UTF-8"
> > > >
> > > > Hi Greg,
> > > >
> > > > One of the major step changes in the early growth of the English
> > > > Wikipedia was when a bot called RamBot created stub articles on US
> > > > places. I think they were cited to the census. Others have created
> > > > articles on rivers in countries and various other topics by similar
> > > > programmatic means. Nowadays such article creation is unlikely to get
> > > > consensus on the English Wikipedia, but there are some languages which
> > > > are very open to such creations and have them by the million.
> > > >
> > > > I'm not sure if the fastest updating of existing articles is automated
> > > > or just semiautomated. But looking at the bot requests page, it
> > > > certainly looks like some people are running such maintenance bots:
> > > > "updating GDP by country" is a current bot request.
> > > > https://en.wikipedia.org/wiki/Wikipedia:Bot_requests
> > > >
> > > > I'm not sure how "the ease of a source for purposes of converting into
> > > > a table and generating a separate article for each row" relates to
> > > > gender. But I suspect "number of times cited in Wikipedia" deserves
> > > > less kudos than "number of times cited in academia".
> > > >
> > > > WSC
> > > >
> > > > On Sun, 25 Aug 2019 at 05:22, Greg <thenatureprogram(a)gmail.com> wrote:
> > > >
> > > > > Thanks again, Kerry. I am hoping that someone with access to more
> > > > > resources (knowledge, support, etc) than I have will look into this.
> > > > >
> > > > > A few more thoughts/questions:
> > > > >
> > > > > 1. The link to the citation dataset from the Medium article ("What
> > > > > are the ten most cited sources on Wikipedia? Let’s ask the data.") is
> > > > > broken.
> > > > > 2. As far as I can tell, every named author in the top ten most
> > > > > cited sources on Wikipedia is male. One piece is by a working group.
> > > > > 3. This line from the Medium piece struck me: "Many of these
> > > > > publications have been cited by Wikipedians across large series of
> > > > > articles using powerful bots and automated tools."
> > > > >
> > > > > Are citations being added by bots? I'm not sure that I understand
> > > > > that line correctly.
> > > > >
> > > > > Greg
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > ------------------------------
> > > >
> > > > Message: 2
> > > > Date: Sun, 25 Aug 2019 21:16:25 -0700
> > > > From: Greg <thenatureprogram(a)gmail.com>
> > > > To: wiki-research-l(a)lists.wikimedia.org
> > > > Subject: Re: [Wiki-research-l] gender balance of Wikipedia
> > > > citations
> > > > Message-ID:
> > > > <CAOO9DNvGyfvJkzyRq60cSQi-T80mAkUa=
> > > > vCPkzFbEysfGQqnVg(a)mail.gmail.com>
> > > > Content-Type: text/plain; charset="UTF-8"
> > > >
> > > > Thanks, WSC. All very interesting.
> > > >
> > > > I've been thinking about Wikipedia citations less in terms of kudos
> > > > and more in terms of a feedback loop. The cited sources get a
> > > > significant amount of attention (1 click per 200 pageviews is the
> > > > number I saw recently). When I imagine total Wikipedia traffic,
> > > > that's huge. How many students are finding sources this way? How many
> > > > academics? And how many of these citations are finding their way back
> > > > into academic publications via this mechanism?
> > > >
> > > > Assuming this is happening to some degree, the gender imbalance of
> > > > the citations is also reflected. If the Wikipedia imbalance is the
> > > > same as the one in academia, that's one thing; if it is better on
> > > > Wikipedia than it is in academia, that's reason to celebrate; if the
> > > > balance is worse, that's concerning. In fact, if the gender imbalance
> > > > conforms to my fears instead of my hopes, and is magnified by the
> > > > massive website traffic, I imagine it could even explain the growth
> > > > in the citation disparity researchers note in their study of
> > > > political science texts. (I link to that study in a previous post; it
> > > > was mentioned in the Washington Post recently.)
> > > >
> > > > There is a very real possibility that Wikipedia is making the
> > > > citation gender gap worse. I think we need to understand what is
> > > > happening and take immediate action if the news is not good.
> > > >
> > > > Greg
> > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > ------------------------------
> > > >
> > > > Message: 3
> > > > Date: Mon, 26 Aug 2019 10:59:07 +0300
> > > > From: "Federico Leva (Nemo)" <nemowiki(a)gmail.com>
> > > > To: Research into Wikimedia content and communities
> > > > <wiki-research-l(a)lists.wikimedia.org>, Aaron Halfaker
> > > > <ahalfaker(a)wikimedia.org>, Kerry Raymond <
> > > kerry.raymond(a)gmail.com>
> > > > Subject: Re: [Wiki-research-l] sockpuppets and how to find them
> > > > sooner
> > > > Message-ID: <cf2734ff-d2cf-3108-691f-8ecf46125ed7(a)gmail.com>
> > > > Content-Type: text/plain; charset=utf-8; format=flowed
> > > >
> > > > Please everyone avoid using jargon specific to the English
> > > > Wikipedia on this cross-language and cross-wiki mailing list.
> > > >
> > > > Aaron Halfaker, 23/08/19 17:36:
> > > > > I think embeddings[1] would be a nice way to create a signature.
> > > >
> > > > There is some discussion of acceptable user fingerprinting
> > > > (presumably to be available to CheckUsers only), other than the
> > > > usual over-reliance on IP addresses, in particular at
> > > > <https://meta.wikimedia.org/wiki/Talk:IP_Editing:_Privacy_Enhancement_and_Ab…>.
> > > >
> > > > Federico
> > > >
> > > >
> > > >
> > > > ------------------------------
> > > >
> > > > Message: 4
> > > > Date: Mon, 26 Aug 2019 10:17:46 +0200
> > > > From: Jane Darnell <jane023(a)gmail.com>
> > > > To: Research into Wikimedia content and communities
> > > > <wiki-research-l(a)lists.wikimedia.org>
> > > > Subject: Re: [Wiki-research-l] gender balance of Wikipedia citations
> > > > Message-ID:
> > > > <CAFVcA-G87k26nBMr=-e-+C8o6eG0KQvVihH=
> > > > f4M40faVNbKkqw(a)mail.gmail.com>
> > > > Content-Type: text/plain; charset="UTF-8"
> > > >
> > > > Greg,
> > > > Thanks for worrying. This is a known problem and yes, Wikipedia
> > > > contributes to the Gendergap in citations and no, it's not an easy
> > > > fix, since it is the fault of systemic bias in academia. So fewer
> > > > women are head author on scientific publications, and it is generally
> > > > only the head author that gets cited on Wikipedia. This is not just a
> > > > problem with written works in the field of politics. I spend most of
> > > > my time working on paintings and their documented catalogs, so
> > > > generally I only notice and fix this problem in art catalogs. Women
> > > > rarely appear as the lead author mentioned. I will always add them in
> > > > to descriptions when I add items for their works on Wikidata, but I
> > > > cannot always find them! Sometimes I can't even create items for them
> > > > because all I have is a name and a work and nothing else available
> > > > online anywhere. You see this most often with women who spent entire
> > > > careers working at a single institution and the institution doesn't
> > > > bother to promote their work or even list them in exhibition
> > > > catalogs. With luck there might be a local obituary, but not always.
> > > > If you have suggestions how to set up a Wikiproject to tackle this it
> > > > would be a good idea. In my onwiki experience the Women-in-Red
> > > > community can be very positive in their response to gendergap-related
> > > > issues for women writers.
> > > > Jane
> > > >
> > > >
> > > >
> > > > ------------------------------
> > > >
> > > > Message: 5
> > > > Date: Mon, 26 Aug 2019 11:45:09 +0300
> > > > From: "Federico Leva (Nemo)" <nemowiki(a)gmail.com>
> > > > To: Research into Wikimedia content and communities
> > > > <wiki-research-l(a)lists.wikimedia.org>, Greg
> > > > <thenatureprogram(a)gmail.com>
> > > > Subject: Re: [Wiki-research-l] gender balance of wikipedia citations
> > > > Message-ID: <835202af-4653-641e-782e-c619458bdd7f(a)gmail.com>
> > > > Content-Type: text/plain; charset=utf-8; format=flowed
> > > >
> > > > Greg, 22/08/19 06:19:
> > > > > I do not know the current status of wikicite or if/when this
> > > > > could be used for this inquiry--either to examine all, or a
> > > > > sensible subset of the citations.
> > > >
> > > > If I see correctly, you still did not receive an answer on the data
> > > > available.
> > > >
> > > > It's true that the Figshare item for
> > > > <https://meta.wikimedia.org/wiki/Research:Scholarly_article_citations_in_Wik…>
> > > > was deleted (I've asked about it on the talk page), but it's trivial
> > > > to run https://pypi.org/project/mwcites/ and extract the data
> > > > yourself, at least for citations which use an identifier.
> > > >
> > > > Some example datasets produced this way:
> > > > https://zenodo.org/record/15871
> > > > https://zenodo.org/record/55004
> > > > https://zenodo.org/record/54799
> > > >
> > > > Once you extract the list of works, the fun begins. You'll need to
> > > > intersect with other data sources (Wikidata, ORCID, other?) and
> > > > account for a number of factors until you manage to find a subset of
> > > > the data which has a sufficiently high signal:noise ratio. For
> > > > instance you might need to filter or normalise by
> > > > * year of publication (some year recent enough to have good data but
> > > > old enough to allow the work to be cited elsewhere, be archived after
> > > > embargos);
> > > > * country or institution (some probably have better ORCID coverage);
> > > > * field/discipline and language;
> > > > * open access status (per Unpaywall);
> > > > * number of expected pageviews and clicks (for instance using
> > > > <https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews> and
> > > > <https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream#Releases>;
> > > > a link from 10k articles on asteroids or proteins is not the same as
> > > > being the lone link from a popular article which is not the same as a
> > > > link buried among a thousand others on a big article);
> > > > * time or duration of the addition (with one of the various diff
> > > > extraction libraries, content persistence data or possibly historical
> > > > eventstream if such a thing is available).
> > > >
> > > > To avoid having to invent everything yourself, maybe you can reuse
> > > > the method of some similar study, for instance the one on the open
> > > > access citation advantage or one of the many which studied the gender
> > > > imbalance of citations and peer review in journals.
> > > >
> > > > However, it's very possible that the noise is just too much for a
> > > > general computational method. You might consider a more manual
> > > > approach on a sample of relevant events, for instance the *removal*
> > > > of citations, which is in my opinion more significant than the
> > > > addition.* You might extract all the diffs which removed a citation
> > > > from an article in the last N years (probably they'll be in the order
> > > > of 10^5 rather than 10^6), remove some massive events or outliers,
> > > > sample 500-1000 of them randomly and verify the required data
> > > > manually.
> > > >
> > > > As usual it will be impossible to have an objective assessment of
> > > > whether that citation was really (in)appropriate in that context
> > > > according to the (English or whatever) Wikipedia guidelines. To test
> > > > that too, you should replicate one of the various studies of the
> > > > gender imbalance of peer review, perhaps one of those which tried to
> > > > assess the impact of a double blind peer review system on the gender
> > > > imbalance. However, because the sources are already published, you'd
> > > > need to provide the agendered information yourself and make sure the
> > > > participants perform their assessment in some controlled environment
> > > > where they don't have access to any gendered information (i.e. where
> > > > you cut them off the internet).
> > > >
> > > > How many years do you have to work on this project? :-)
> > > >
> > > > Federico
> > > >
> > > > (*) I might add a citation just because it's the first result a
> > > > popular search engine gives me, after glancing at the abstract and
> > > > maybe the journal home page; but if I remove an existing citation,
> > > > hopefully I've at least assessed its content and made a judgement
> > > > about it, apart from cases of mass removals for specific problems
> > > > with certain articles or publication venues.
> > > >
> > > >
> > > >
> > > > ------------------------------
> > > >
> > > > End of Wiki-research-l Digest, Vol 168, Issue 20
> > > > ************************************************
> > > >
> >
> >
> > ------------------------------
> >
> > End of Wiki-research-l Digest, Vol 168, Issue 22
> > ************************************************
> >
> ------------------------------
>
> End of Wiki-research-l Digest, Vol 168, Issue 25
> ************************************************
>
Dear all,
I thank you for your efforts. I invite you to see my Wikimania report at https://meta.m.wikimedia.org/wiki/Wikimedia_France/Micro-financement/Wikima…. I am still waiting for the video of my session, entitled "Wikidata and Health: Current situation and perspectives".
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
Undergraduate Researcher, UR12SP36
GLAM and Education Coordinator, Wikimedia TN User Group
Member, WikiResearch Tunisia
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
Co-Founder, WikiLingua Maghreb
Founder, TunSci
____________________
+21629499418