We’re preparing for the June 2021 research newsletter and looking for
*Because the May issue of the Wikipedia Signpost (with which we co-publish)
had to be canceled, we skipped last month. But we will resume with
this June issue, due out this Sunday. One focus will be papers presented
recently at Wiki Workshop 2021.*
Please take a look at https://etherpad.wikimedia.org/p/WRN202106 and add
your name next to any paper you are interested in covering. Our target
time is 27 June 20:00 UTC. If you can't make this deadline but would like
to cover a particular paper in the subsequent issue, leave a note next to
the paper's entry.
As usual, short notes and one-paragraph reviews are most welcome.
- A Brief Analysis of Bengali Wikipedia’s Journey to 100,000 Articles
- Assessing the quality of health-related Wikipedia articles with
generic and specific metrics
- Bridging the Gender Gap: A research study on Indian Language Wikimedia
- Characterizing Opinion Dynamics and Group Decision Making in Wikipedia
- Do I Trust this Stranger? Generalized Trust and the Governance of
- Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles
Using Annotation Recommendation
- Inferring Sociodemographic Attributes of Wikipedia Editors:
State-of-the-art and Implications for Editor Privacy
- Information flow on COVID-19 over Wikipedia: A case study of 11
- Language-agnostic Topic Classification for Wikipedia
- Languages of Knowledge Infrastructures: Learnings from Research on
Indian Language Wikimedia Projects
- Negative Knowledge for Open-world Wikidata
- References in Wikipedia: The Editors’ Perspective
- ShExStatements: Simplifying Shape Expressions for Wikidata
- Simple Wikidata Analysis for Tracking and Improving Biographies in
- Structural Analysis of Wikigraph to Investigate Quality Grades of
- The Language of Liberty: A preliminary study
- Towards Ongoing Detection of Linguistic Bias on Wikipedia
- Towards Open-domain Vision and Language Understanding with Wikimedia
- Tracing the Factoids: the Anatomy of Information Re-organization in
- Wikidata Logical Rules and Where to Find Them
- Wikipedia Editor Drop-Off: A Framework to Characterize Editors'
- WikiShark: An Online Tool for Analyzing Wikipedia Traffic and Trends
*Masssly and Tilman Bayer*
A monographic issue of the Spanish journal Área Abierta, vol. 21, no. 2,
2021, entitled "Wikipedia, twenty years of free knowledge", has just
It contains twelve papers on different aspects of Wikipedia and its
complex informational world, from corporate communication to education,
including credibility, social memory or COVID-19.
"Investigación básica es lo que hago cuando no sé lo que estoy haciendo."
"Basic research is what I am doing when I don't know what I am doing."
Wernher von Braun (1957)
Jesús Tramullas, PhD
Dept. Ciencias Documentación // Dept. of Library & Information Science
Universidad de Zaragoza 50009 Zaragoza (España)
Apologies for any duplicates received due to cross-posting. Please help us by forwarding to ensure wide distribution.
The Work in the Age of Intelligent Machines (WAIM) Research Coordination Network is pleased to announce the WAIM Doctoral Research Fellowship program—a competition that aims to recognize and support outstanding graduate research related to the convergence of intelligent machines and the future of work. With funding from the National Science Foundation’s Future of Work at the Human-Technology Frontier<https://www.nsf.gov/eng/futureofwork.jsp> initiative, this fellowship program will confer a select number of $50,000 awards to eligible US doctoral students to enable them to focus solely on their research during the 2021–2022 academic year. In sponsoring this program, we hope to identify the next set of leaders in building the deep and systematic knowledge required to simultaneously consider the technical, individual, group, organizational, and societal issues involved in leveraging today’s expanding technological capabilities to serve both work and workers.
Eligible candidates for this program are late-stage students enrolled in a research-based doctoral program at an accredited US institution who are engaged in research that embraces the future of work as a socio-technological phenomenon, one that requires attention to both social and technological systems as well as the implications of their interdependencies. Addressing this challenge requires a research approach that expands beyond a delimited focus on autonomous systems qua systems and instead endorses a convergent approach, namely “the deep integration of knowledge, techniques, and expertise from multiple fields to form new and expanded frameworks” (NSF, 2017). As such, we welcome candidates who draw on a wide variety of disciplinary perspectives, including labor studies, law, psychology, computer science, management science, policy studies, anthropology, sociology, learning science, and cognitive science, among others.
Future WAIM Fellows will interact as a cohort to enable peer exchange and learning. The cohort will come together physically at two meetings during the 2021–2022 academic year. The first will be a kickoff event, currently planned for September 2021 in Manhattan. The second event will be during the summer of 2022 at the final WAIM Convergence Conference (location TBD). We will also require that each Fellow virtually attend the dissertation defense of at least two peer Fellows and will encourage each Fellow to open any public presentation of their research to other members of the cohort as well as members of the larger WAIM network.
Fellowship applications will be evaluated in relation to their potential for intellectual merit, broader impact, and resonance with the WAIM RCN mission and theme<https://waim.network/about>. For the purpose of this call, we define ‘work’ as the mental or physical activity to achieve tangible benefit such as income, profit, or community welfare. We use the phrase ‘intelligent machines’ to refer to computing technologies characterized by autonomy, the ability to learn, and the ability to interact with other systems and with humans. The first cohort of WAIM Fellows will be announced in early August 2021.
To apply, please submit the following materials on the online reviewing system<https://syracuse.infoready4.com/#competitionDetail/1845709> by 15 July 2021 (midnight EDT):
1. Application summarizing a) current thesis status and expected timeline for completion (i.e., dissertation defense), b) potential intellectual merit and broader impact of dissertation research relative to the WAIM RCN theme and goals<https://waim.network/about>, c) thesis committee members and d) agreement to eligibility requirements.
2. A 2500-word summary of dissertation research (not including references).
3. Current CV.
4. Letter of recommendation from dissertation advisor, including confirmation of dissertation research timeline and eligibility for the fellowship.
Please direct any questions to Kevin Crowston, Ingrid Erickson or Jeff Nickerson at fellowship(a)waim.network.
Download this call as a PDF<https://waim.network/fellowshippdf>.
Associate Dean for Research, Distinguished Professor of Information Science
School of Information Studies
Editor-in-chief, ACM Transactions on Social Computing and Information, Technology & People
+1 (315) 443.1676
348 Hinds Hall, Syracuse, NY 13244
Check out our research coordination network on Work in the Age of Intelligent Machines: http://waim.network/
Please find the second call for papers for the Wikidata workshop below. I
am looking forward to reading your work related to Wikidata and seeing some
of you there!
The Second Wikidata Workshop
Co-located with the 20th International Conference on Semantic Web (ISWC
Date: October 24 or 25, 2021
The workshop will be held online, afternoon European time.
== Important dates ==
Papers due: Friday, July 30, 2021
Notification of accepted papers: Friday, September 24, 2021
Camera-ready papers due: Monday, October 4, 2021
Workshop date: October 24/25, 2021
== Overview ==
Wikidata is an openly available knowledge base, hosted by the Wikimedia
Foundation. It can be accessed and edited by both humans and machines and
acts as a common structured-data repository for several Wikimedia projects,
including Wikipedia, Wiktionary, and Wikisource. It is used in a variety of
applications by researchers and practitioners alike.
In recent years, we have seen an increase in the number of publications
around Wikidata. While there are several dedicated venues for the broader
Wikidata community to meet, none of them focuses on publishing original,
peer-reviewed research. This workshop fills this gap - we hope to provide a
forum to build this fledgling scientific community and promote novel work
and resources that support it.
The workshop seeks original contributions that address the opportunities
and challenges of creating, contributing to, and using a global,
collaborative, open-domain, multilingual knowledge graph such as Wikidata.
We encourage a range of submissions, including novel research, opinion
pieces, and descriptions of systems and resources, which are naturally
linked to Wikidata and its ecosystem or enabled by it. What we’re less
interested in are works that use Wikidata alongside or in lieu of other
resources to carry out some computational task - unless the work feeds back
into the Wikidata ecosystem, for instance by improving or commenting on
some Wikidata aspect, or suggesting new design features, tools, and
We also encourage submissions on the topic of Abstract Wikipedia,
particularly around collaborative code management, natural language
generation by a community, the abstract representation of knowledge, and
the interaction between Abstract Wikipedia and Wikidata on the one hand, and
between Abstract Wikipedia and the language Wikipedias on the other.
We welcome interdisciplinary work, as well as interesting applications that
shed light on the benefits of Wikidata and discuss areas of improvement.
The workshop is planned as an interactive half-day event, in which most of
the time will be dedicated to discussions and exchange rather than oral
presentations. For this reason, all accepted papers will be presented in
short talks and accompanied by a poster. All works will be presented
== Topics ==
Topics of submissions include, but are not limited to:
- Data quality and vandalism detection in Wikidata
- Referencing in Wikidata
- Anomaly, bias, or novelty detection in Wikidata
- Algorithms for aligning Wikidata with other knowledge graphs
- The Semantic Web and Wikidata
- Community interaction in Wikidata
- Multilingual aspects in Wikidata
- Machine learning approaches to improve data quality in Wikidata
- Tools, bots, and datasets for improving or evaluating Wikidata
- Participation, diversity, and inclusivity aspects in the Wikidata
- Human-bot interaction
- Managing knowledge evolution in Wikidata
- Abstract Wikipedia
== Submission guidelines ==
We welcome the following types of contributions.
- Full research paper: Novel research contributions (7-12 pages)
- Short research paper: Novel research contributions of smaller scope than
full papers (3-6 pages)
- Position paper: Well-argued ideas and opinion pieces, not yet in the
scope of a research contribution (6-8 pages)
- Resource paper: New dataset or other resources directly relevant to
Wikidata, including the publication of that resource (8-12 pages)
- Demo paper: New system critically enabled by Wikidata (6-8 pages)
Submissions must be in PDF or HTML, formatted in the Springer style for
Lecture Notes in Computer Science (LNCS). For details on the LNCS style,
see Springer’s Author Instructions.
The papers will be peer-reviewed by at least three researchers. Accepted
papers will be published as open access papers on CEUR (we will only
publish to CEUR if the authors agree to have their papers published).
Papers must be submitted through EasyChair:
== Proceedings ==
The complete set of papers will be published with the CEUR Workshop
== Organizing committee ==
Lucie-Aimée Kaffee, University of Southampton, lucie.kaffee[[(a)]]gmail.com
Simon Razniewski, Max Planck Institute for Informatics, srazniew[[@]]
Aidan Hogan, University of Chile, ahogan[[(a)]]dcc.uchile.cl
== Programme committee ==
Miriam Redi, Wikimedia Foundation
John Samuel, CPE Lyon
Dennis Diefenbach, University Jean Monnet
Lydia Pintscher, Wikimedia Deutschland
Edgar Meij, Bloomberg L.P.
Thomas Pellissier Tanon, Lexistems
Hiba Arnaout, MPI for Informatics
Fabian Suchanek, Télécom ParisTech
Filip Ilievski, ISI
Marco Ponza, Bloomberg L.P.
Heiko Paulheim, University of Mannheim
Cristina Sarasua, University of Zurich
Pavlos Vougiouklis, Huawei Technologies, Edinburgh
Finn Årup Nielsen, Technical University of Denmark
Andrew D. Gordon, Microsoft Research & University of Edinburgh
This month's Wikimedia Research Showcase will be held on Wednesday, June
23, at 16:30 UTC (9:30am PDT). The theme is "AI model governance" with
presentations from Haiyi Zhu of Carnegie Mellon University and Andy Craze
and Chris Albon of the Wikimedia Foundation's Machine Learning Team.
Speaker: Haiyi Zhu (Carnegie Mellon University)
Title: Bridging AI and HCI: Incorporating Human Values into the Development
of AI Technologies
Abstract: The increasing accuracy and falling costs of AI have stimulated
the increased use of AI technologies in mainstream user-facing applications
and services. However, there is a disconnect between mathematically
rigorous AI approaches and the human stakeholders’ needs, motivations, and
values, as well as organizational and institutional realities, contexts,
and constraints; this disconnect is likely to undermine practical
initiatives and may sometimes lead to negative societal impacts. In this
presentation, I will discuss my research on incorporating human
stakeholders’ values and feedback into the creation process of AI
technologies. I will describe a series of projects in the context of the
Wikipedia community to illustrate my approach. I hope this presentation
will contribute to the rich ongoing conversation concerning bridging HCI
and AI and using HCI methods to address AI challenges.
Speakers: Andy Craze, Chris Albon (Wikimedia Foundation, Machine Learning
Title: ML Governance: First Steps
Abstract: The WMF Machine Learning team is upgrading the Foundation's
infrastructure to support the modern machine learning ecosystem. As part of
this work, the team seeks to understand its ethical and legal
responsibilities for developing and hosting predictive models within a
global context. Drawing from previous WMF research related to ethical &
human-centered machine learning, the team wishes to begin a series of
conversations to discuss how we can deploy responsible systems that are
inclusive to newcomers and non-experts, while upholding our commitment to
free and open knowledge.
Janna Layton (she/her)
Administrative Associate - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
I have recently received approval and can now begin recruiting participants
for a study that seeks to explore the information behaviour of people who
access Wikipedia’s health and medical content. This study applies theories
and models of information behaviour from the discipline of library and
I kindly ask that you share this study widely within your personal and
professional networks as I require 25-30 participants. Participants will be
asked to complete a short online survey and participate in an interview
over Zoom. Participants who complete both the survey and the interview will
receive a $25 CAD gift card to a coffee shop of their choice.
I have attached the recruitment poster here. Details about the study can be
found on the study’s project page
those interested in participating can email wikistudy3(a)gmail.com
Thanks very much,
The Social Data Science program at the University of Oxford is hiring!
Below is the link for an open position in the MSc program in Social Data
Science. This position is suitable for someone in their early or mid-career
stage. It is a three-year fixed term position in the first instance and
funded directly by the Oxford Internet Institute.
Our SDS program is expanding and continually finding new ways to consider the
relationships between the computational and social sciences while tackling
problems of social and political importance, particularly with respect to
social inequality and internet governance. This position will be an
exciting and hopefully rewarding opportunity to expand the scope of our
interdisciplinary program and mentor exceptional students while building
your own research profile.
The post closes July 15th, with shortlisting to follow in the week
thereafter. Please feel free to circulate this post among your networks.
Dr Scott A. Hale
Associate Professor and Senior Research Fellow,
Co-Director of the International Multimodal Communication Centre
Oxford Internet Institute, University of Oxford
SCR Member, St Antony's College, Oxford
Turing Fellow, Alan Turing Institute
The Public Service Media and Public Service Internet Manifesto
The Internet and the media landscape are broken. The dominant commercial
Internet platforms endanger democracy. Despite all the great
opportunities the Internet has offered to society and individuals, the
digital giants have acquired unparalleled economic, political and
cultural power. As currently organised, the Internet separates and
divides instead of creating common spaces for negotiating difference and
disagreement. Democracy requires Public Service Media and a Public
The Public Service Media and Public Service Internet Manifesto:
Please sign the Manifesto:
Occupy the Internet: 200 Media Experts Publish an Alarming Wake Up Call
and Demand a Public Service Internet
Launch event (video):
I'd like to announce a new dataset of Wikipedia pageviews that shows trends
in search engine usage across countries, language editions, internet
browsers, and device operating systems (
https://techblog.wikimedia.org/2021/06/07/searching-for-wikipedia/). We are
releasing this dataset within the context of the Foundational program
and in particular research focused on Wikimedia's relationship with
external platforms. This dataset fills a large gap in both public
information about global search engine usage and our understanding of the
external platforms that mediate how readers reach Wikipedia.
For context, approximately 50% of Wikipedia pageviews come directly from
clicking on links on search engines (with another 30% coming from clicking
on internal links within Wikipedia and the rest from external sites or
direct navigation). For those 50% of pageviews coming from search
engines, this dataset (and corresponding dashboard) allows you to examine
which search engines were used and how those patterns vary by country,
Wikipedia language, internet browser, and device operating system.
If you have any questions about these datasets or related projects please
feel free to contact me. More details here: