Summary: I have made changes to the introductory description of this
mailing list. These include changes to what this mailing list is used
for (reflecting more closely how it has actually been used for the
past couple of years), expanded information about other lists
that you may want to be aware of, the mailing list norms, and the
introduction of topics and tags. You can review the updated
description at https://lists.wikimedia.org/mailman/listinfo/wiki-research-l.
As a participant of this mailing list, you are responsible for knowing
the description of the mailing list, and I kindly ask you to review it.
==Who am I?==
I'm one of the admins of this list (and head of Research at Wikimedia
==Why the change?==
On a personal level: I want to be able to communicate with the
research community around Wikimedia projects more often, I don't want
to create a new list for my communications, and I saw the note about
the limits on the frequency of posts in the old description of this
list as something that "kept me out". Inspired by this need, I had a
more comprehensive look at the full description.
The previous version of the description had a few areas to improve
based on the current needs and realities of this list. I name a couple
of them below:
* We had a note asking that the frequency of emails to this list be kept
low. With the stack of technologies available to us today, we can
relax this condition and ask people to use tags/topics in the
subject of their emails to allow for more emails to come to the list.
* The note on who can post to this list could be improved. It read
"only people who are actively involved in research on Wikimedia
projects should post to this list" while my understanding is that on
this list we want to welcome research related questions from those who
are not actively involved in research, too. For example, community
organizers should feel welcome to ask research-related questions on
==What process did I follow for this change?==
I wrote my proposed changes in an etherpad and sent it to 10 people.
These folks include the two other admins of this list, a couple of
other Research folks from WMF, a few folks from the Wikimedia Research
community, and two editors from the community who are active in the
research space. I heard back from 3 of these folks with thumbs up and
areas for improvement. I incorporated all the suggestions I received.
==What if you have suggestions for improvements?==
Sure. Bring them up on this thread or privately. Please note that I
will be at the Wikidata Conference and after that at our annual
Research Offsite and may not be able to respond to you until
==Where can I see the updated description?==
Please go to https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
to review it.
 Read the previous version of the description below:
The purpose of this mailing list is to discuss scientific research
into the content and the communities of the Wikimedia projects:
Wikipedia, Wiktionary, Wikibooks, Wikisource, Wikiquote, Wikinews,
Wikispecies, and Wikimedia Commons, Meta-Wiki.
Research into the technology of Wikimedia, MediaWiki, should primarily
be discussed on <a
instead. For content or community research projects with a strong
technological component, cross-posting to both lists may be advisable.
Please note that only people who are actively involved in research on
Wikimedia projects should post to this list. Typical on-topic posts
* announcement of a new research project
* discussions of methodology
* questions and answers about related projects
Mailing list traffic should be kept at a reasonably low level. The
list is softly moderated, and individuals posting off-topic material
repeatedly may be removed.
This list is not directly associated with the
Research Network, though members of the Network are welcome to
post here if they are involved in research projects relating to
content or community. Internal Wikimedia matters, discussions of new
projects and similar threads should be kept off the list.
Principal Research Scientist, Head of Research
We’re preparing for the October 2019 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201910 and add your name next to any paper you are interested in covering. Our target publication date is 30 October 11:59 UTC. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
- Science Is Shaped by Wikipedia: Evidence From a Randomized Control Trial
- “On Altpedias: partisan epistemics in the encyclopaedias of alternative facts”. In “After the post-truth
- A Clockwork Wikipedia: From a Broad Perspective to a Case Study
- A deep learning-based quality assessment model of collaboratively edited documents: A case study of Wikipedia
- Finding Synonymous Attributes in Evolving Wikipedia Infoboxes
- GeneDB and Wikidata
- Reader Engagement with Wikipedia’s Medical Content
- Sepsis information-seeking behaviors via Wikipedia between 2015 and 2018: A mixed methods retrospective observational study
- The Global Popularity of William Shakespeare in 303 Wikipedias
Masssly and Tilman Bayer
The September 2019 issue of the Wikimedia Research Newsletter is out:
In this issue:
1 "Reducing Procrastination While Improving Performance: A Wiki-powered Experiment with Students"
2 The Importance of Wikipedia in Assessing News Source Credibility
3 Wikipedia Topic Assessment
4 OpenSym 2019
4.1 First literature survey of Wikidata quality research
4.2 "Article Quality Classification on Wikipedia: Introducing Document Embeddings and Content Features"
4.3 "When Humans and Machines Collaborate: Cross-lingual Label Editing in Wikidata"
4.4 "Approving automation: analyzing requests for permissions of bots in Wikidata"
4.5 "Dwelling on Wikipedia: Investigating Time Spent by Global Encyclopedia Readers"
4.6 "Visualization of the Evolution of Collaboration and Communication Networks in Wikis"
4.7 Wikitribune navigating "challenges of collaborative evidence-based journalism"
5 Conferences and events
6 Other recent publications
6.1 "Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start"
6.2 Shocks make both newcomers and experienced editors contribute more
6.3 "Crosslingual Document Embedding As Reduced-Rank Ridge Regression"
6.4 "Framing the Holocaust Online: Memory of the Babi Yar Massacres on Wikipedia"
6.5 "Framing the Holocaust in popular knowledge: 3 articles about the Holocaust in English, Hebrew and Polish Wikipedia"
*** 15 recent publications were covered or listed in this issue ***
Masssly and Tilman Bayer
Wikimedia Research Newsletter
https://meta.wikimedia.org/wiki/Research:Newsletter/
* Follow us on Twitter: @WikiResearch
* Like us on Facebook: Facebook.com/WikiResearch/
* Receive this newsletter by mail: Research-newsletter Mailing List - Wikimedia
CFP: International Conference on Social Media and Society (#SMSociety) 2020
Theme – “Diverse Voices: Promises and Perils of Social Media For Diversity”
Join us on July 22–24, 2020 for the 11th annual International Conference on
Social Media and Society (#SMSociety)
The conference is an interdisciplinary gathering of leading social media
researchers, practitioners, and analysts from around the world. The 2020
conference is hosted by the College of Communication, Studio Chi, and
the College of Computing and Digital Media at DePaul University.
Until a decade and a half ago, large media companies and governments lorded
over an oligopoly controlling the means for communicating with the masses.
That oligopoly was (temporarily?) broken by the introduction of social
media with its revolutionary promise to democratize civil discourse and
civil society. It has offered diverse groups the opportunity to connect to
one another and form communities of practice that ultimately can serve to
strengthen their voice. However, in recent years, we have also discovered
that connecting the world via social media leads to new challenges. When so
many diverse voices are brought together on a massive scale, conflict is
common. Interpretation of this conflict is itself diverse: do we see a rise
of incivility or freedom from moral policing? Extremism or idealism?
Distrust or critique? The very same digital tools that amplify voices of
the marginalized can also be used to silence diverse voices online through
online harassment, doxing, trolling, and other measures.
In this context, the *International Conference on Social Media & Society*
invites scholarly and original submissions that explore key questions and
central issues related (but not limited) to the 2020 theme of “*Diverse Voices:
Promises and Perils of Social Media For Diversity*.” We welcome research
from a wide range of methodological perspectives employing established
quantitative, qualitative, and mixed methods as well as innovative
approaches that cross disciplinary boundaries and expand our understanding
of the current and future trends in social media research, especially
research that seeks to explore questions such as:
- Can human society handle the connection of diverse and divergent
(often conflicting) voices on social media? Is empowering every voice via
social media a net good?
- Under what conditions can social media build bridges across
difference? When does social media reinforce division?
- What social media affordances are good or bad in terms of supporting
diversity, including but not limited to: race, class, gender, sexual,
cultural, political and linguistic diversity?
- What is the role of government regulation? Should social media be
limited to activities that help to build strong societies?
- Is it naive to think that giving everyone a voice will result in
- And what is the role of social media in all of these? And how can
social media platforms be used to empower marginalized voices?
- Can AI and other automated tools help social media consumers and
producers to overcome these challenges and provide online spaces for
engagement with diverse voices? If so, how?
- *Full papers* (6-10 pages) Due: *Jan. 27, 2020*
- *WIP papers* (1000-word extended abstract) Due: *Jan. 27, 2020*
- *Panels, Workshops, & Posters* Due: *Mar. 16, 2020*
Full papers presented at the Conference will be published in the
conference proceedings by the *ACM International Conference Proceeding Series*
and will be available in the ACM Digital Library.
*ABOUT THE CONFERENCE*
The International Conference on Social Media & Society
is an annual gathering of leading social media researchers from around the
world. It is the premier venue for sharing and discovering new
peer-reviewed interdisciplinary research on how social media affects
society. Organized by the Social Media Lab at Ted Rogers School of
Management at Ryerson University, #SMSociety provides participants
with opportunities to exchange ideas,
present original research, learn about recent and ongoing studies, and
network with peers.
The conference’s intensive three-day program features hands-on workshops,
full papers, work-in-progress papers, panels, and posters. The wide-ranging
topics in social media showcase research from scholars working in many
fields including Communication, Computer Science, Education, Journalism,
Information Science, Management, Political Science, Psychology and
*TOPICS OF INTEREST (Not Exhaustive)*
SOCIAL MEDIA IMPACT ON SOCIETY
- Trust & credibility
- Political mobilization & engagement
- Extremism & terrorism
- Dis/Mis/Mal-information, aka “Fake news”
- Politics of hate and oppression
- Health and well-being
SOCIAL MEDIA & BUSINESS
- Brand communities
- Influencers and consumer engagement
- Consumer behavior & social media marketing
- Public & customer relations
- Cybervetting and HR
SOCIAL MEDIA & PUBLIC SECTOR
- Government regulations of social media
- Government social media management
- Adoption, use, strategies and policies
- Citizens’ engagement
- Citizens’ privacy & security concerns
- Public opinions on environmental issues
SOCIAL MEDIA & ACADEMIA
- Alternative metrics
- Learning analytics
- Teaching with social media
- University branding
- Knowledge Translation/management
- Online community detection
- Influential user detection
- Identity and anonymity
- Case studies
THEORIES & METHODS
- Qualitative approaches
- Quantitative approaches
- Mixed methods
- Opinion mining & sentiment analysis
- Social network analysis
- Theoretical models
BIG & SMALL DATA
- Value of small data
- Data mining and analytics
- Sampling issues
- Scalability issues
- Data Access/Scraping
- Data Biases
- Anatoliy Gruzd, Ryerson University, Canada – Conference Chair
- Philip Mai, Ryerson University, Canada – Conference Chair
- Raquel Recuero, Universidade Federal de Pelotas (UFPel), Brazil – Full
Paper Chair
- Ángel Hernández-García, Universidad Politécnica de Madrid, Spain –
Full Paper Chair
- Chei Sian Lee, Nanyang Technological University, Singapore – WIP Chair
- James Cook, University of Maine at Augusta, USA – WIP Chair
- Jaigris Hodson, Royal Roads University, Canada – Poster Chair
*HOST COMMITTEE at DePaul University*
- Bree McEwan, Communication Studies, Communication & Technology
- Jill Hopke, Journalism
- Hamed Qahri Saremi, Management Information Systems
- Juan Mundel, Advertising
- Paul Booth, Digital Communication & Media Arts
- Enid Montague, Human Computer Interaction
- Samantha Close, Digital Communication & Media Arts
- Nur Uysal, Public Relations
I thank you for your efforts. I am honoured to inform you that we are working to launch a research unit dealing with Data Science research in Tunisia. This research unit is named "Data Engineering and Semantics". It belongs to the Computer Science and Communications Department, Faculty of Sciences of Sfax, University of Sfax. The group will be led by Dr. Mohamed BEN AOUICHA (https://scholar.google.ca/citations?user=XPVUu-gAAAAJ&hl=fr), who was the person behind the University of Sfax project on the use of Wikipedia and Wiktionary in semantic technologies, particularly semantic similarity measures.

As explained in my previous emails, we need the research unit so that Wikimedia researchers at the University of Sfax are affiliated with an official research structure. The main goal of the unit is to promote and monitor research efforts on Data Engineering and Semantic Technologies at the University of Sfax. Research on Data Engineering and Semantic Technologies is needed to improve the usage and processing of knowledge bases, including wikis. Indeed, we will work in this unit to develop tools to improve the linked data quality of Wikidata and to use Wikidata for a variety of purposes including clinical decision support, Natural Language Processing and Education. We will also analyze the output of wikis, identify deficiencies and work to solve them.

However, this will not be possible without letters of support from several institutions. I ask if your research institution can send us a letter of support so that we can obtain the approval of the Tunisian Ministry of Higher Education to create our new research unit.
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
GLAM and Education Coordinator, Wikimedia TN User Group
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
The next Research Showcase will be live-streamed next Wednesday, October
16, at 9:30 AM PDT/16:30 UTC.
YouTube stream: https://www.youtube.com/watch?v=KZ35weAVlIU
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past Research Showcases here:
This month's presentations:
Elections Without Fake: Deploying Real Systems to Counter Misinformation
By Fabrício Benevenuto, Computer Science Department, Universidade Federal
de Minas Gerais (UFMG), Brazil
The political debate and electoral dispute in the online space during the
2018 Brazilian elections were marked by an information war. In order to
mitigate the misinformation problem, we created the project Elections
Without Fake <http://www.eleicoes-sem-fake.dcc.ufmg.br/> and developed a
few technological solutions able to reduce the abuse of misinformation
campaigns in the online space. Particularly, we created a system to monitor
public groups in WhatsApp and a system to monitor ads in Facebook. Our
systems proved to be fundamental for fact-checking and investigative
journalism, and are currently being used by over 150 journalists with
editorial lines and various fact-checking agencies.
More info on second talk by Francesca Spezzano to come
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
The Analytics team is going to stop HDFS and Yarn services for a
(hopefully) brief time window tomorrow, Tue Oct 15th, from 14:30 to 15:30
We are going to swap the Zookeeper cluster from the one currently used by
all the Kafka production services to a dedicated one within the Analytics
VLAN. More details in https://phabricator.wikimedia.org/T217057
Side effects: the move should be transparent to all users, except the ones
relying on history in Yarn (for example, checking it via yarn.wikimedia.org).
The history is in fact stored in Zookeeper, and to keep things simple we
are not copying znodes over to the new cluster. Please let us know if this
impacts you in any way.
As usual, if the time window affects your work please let us know and we'll
find a new maintenance window.
Luca (on behalf of the Analytics team).
A lot of research on Wikipedia is published in English and also uses the
English Wikipedia as a source of data, or researchers recruit their
participants via the English Wikipedia.
A frequent criticism I encounter when discussing such research with non-en.wp
community members is that their Wikipedia is different and the results of
en.wp-based research are problematic/incomparable/totally useless.
So I want to ask:
- Do you know of research comparing different Wikis, preferably across
language versions? 
- How would you deal with such criticism, particularly of the "if it is not
about 'my' wp it is useless" kind?
 Plausible due to academic fields, particularly Computer Science,
publishing mainly in English, its size, and the WMF as an actor being US-based.
 I know of »revisiting "The Rise and Decline" in a Population of Peer
Production Projects« (https://dl.acm.org/citation.cfm?id=3173929),
comparing different Wikia-Wikis; Research like "limits of
self-organization" (https://firstmonday.org/article/view/1405/1323) that
refer to general principles of peer production. Comparisons of Wikipedias
across languages and the impact of their different contexts, languages and
regulations would be very interesting to me.
 I'm aware that making heterogeneous things comparable is seen as a core
academic/scientific activity in STS research (Law, SL Star, Turnbull…), so I
do not want to say that transfer to a different setting is not a problem – but
it is certainly not "totally useless" either.
UX Design/ Research
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
Our vision is a world in which all people can share in, use, and add to
the knowledge of humanity. Help us achieve it!
Wikimedia Deutschland — Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Berlin-Charlottenburg
district court under number 23855 B. Recognized as a charitable organization
by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Several publications are in preparation... I will get back to you...
Thanks for your interest,
2222 Rue Hochelaga,
Montréal, QC H2K 4N8
+1 (514) 649 0755
*Confidentiality notice*
This message, transmitted by fax, is confidential, and its content
may be protected by professional secrecy. It is for the exclusive use of
its recipient. Any other person is hereby notified
that it is strictly forbidden to disseminate, distribute or
reproduce it. If the recipient cannot be reached or is
unknown to you, please inform the sender immediately
and destroy this message and any copy of it.
On Sat, 28 Sep 2019 at 08:00, <wiki-research-l-request(a)lists.wikimedia.org>
wrote:
> Today's Topics:
> 1. Re: Standardization of Wikipedia articles according to the
> lexical constancy of their introductions and body texts (Morten Wang)
> Message: 1
> Date: Fri, 27 Sep 2019 07:47:00 -0700
> From: Morten Wang <nettrom(a)gmail.com>
> To: Research into Wikimedia content and communities
> Subject: Re: [Wiki-research-l] Standardization of Wikipedia articles
> according to the lexical constancy of their introductions and body
> Content-Type: text/plain; charset="UTF-8"
> Hi Ludovic,
> This work sounds interesting, I'm looking forward to learning more about it
> as your papers come out!
> I read through the post on LinkedIn and from how I interpret it you are
> only looking at two quality classes (Featured Articles vs other articles).
> This seems somewhat odd to me and I'd like to know more about why? The
> current trend when it comes to predicting article quality in the English
> Wikipedia does not limit the prediction problem to just FAs vs the rest,
> instead it's using the whole quality scale. See the list below for some
> papers along this line of research.
> I'm also really curious about what "standardize the cognitive accessibility
> of Wikipedia" means? That might mean more than just "article quality",
> hence why I'm asking.
> All that being said, I think the approach sounds interesting and probably
> adds some signal, so I'm curious to learn more how it works and performs.
> - Warncke-Wang, M., Cosley, D., & Riedl, J. Tell me more: an actionable
> quality model for Wikipedia. OpenSym/WikiSym 2013. [We argue that
> isn't useful because contributors can't change it]
> - Warncke-Wang, M., Ayukaev, V. R., Hecht, B., & Terveen, L. G. The
> success and failure of quality improvement projects in peer production
> communities. CSCW 2015. [See the Appendix for details of the improved
> and how to get good training data]
> - https://www.mediawiki.org/wiki/ORES builds upon the 2015 paper and is
> a readily accessible API, reference datasets are available on figshare
> also in the GitHub repository
> <https://github.com/wikimedia/articlequality>. Now the benchmark to
> compare against, as in the three other papers listed below.
> - Dang, Q. V., & Ignat, C. L. Measuring quality of collaboratively
> edited documents: the case of Wikipedia. CIC 2016. [Shows that adding
> readability features can improve predictions]
> - Dang, Q. V., & Ignat, C. L. An end-to-end learning solution for
> assessing the quality of Wikipedia articles. OpenSym 2017. [Shows the
> performance of RNNs, also contains an important discussion of
> interpretability, etc]
> I also came across this recent paper by Schmidt and Zangerle that reports
> significant improvements, but haven't yet had the time to read the paper
> - Schmidt, M., & Zangerle, E. Article quality classification on
> Wikipedia: introducing document embeddings and content features. OpenSym
> 1. Typically without A-class articles due to how few of them there are.
> On Mon, 23 Sep 2019 at 13:09, Ludovic Bocken <lbocken(a)gmail.com> wrote:
> > Hello,
> > I am finishing my PhDs and I think that you could be interested in my
> > main work about the quality of Wikipedia :
> > and in a future collaboration.
> > I would be very grateful for your feedbacks ! Several publications are in
> > preparation... Let me know if you are interested in following this
> > thread...
> > Have a nice week,
> > Ludovic BOCKEN
> > lbocken(a)gmail.com
> > www.ludovicbocken.com
> > Skype: ludovic.bocken
> > http://www.linkedin.com/in/ludovicbocken
> > 2222 Rue Hochelaga,
> > Montréal, QC H2K 4N8
> > +1 (514) 649 0755