Pursuant to prior discussions about the need for a research
policy on Wikipedia, WikiProject Research is drafting a
policy regarding the recruitment of Wikipedia users to
participate in studies.
At this time, we have a proposed policy, and an accompanying
group that would facilitate recruitment of subjects in much
the same way that the Bot Approvals Group approves bots.
The policy proposal can be found at:
The Subject Recruitment Approvals Group mentioned in the proposal
is being described at:
Before we move forward with seeking approval from the Wikipedia
community, we would like additional input about the proposal,
and would welcome additional help improving it.
Also, please consider participating in WikiProject Research at:
University of Minnesota
I am doing research investigating the role of machine translation in
Wikipedia articles. I am having trouble determining whether an article has
been deleted from Wikipedia. Specifically, I am getting a list of articles
from the cxtranslation list and I would like to know which articles are no
longer on Wikipedia. I see that there is the deletion log form
<https://en.wikipedia.org/wiki/Special:Log/delete> but is there an API or
some way to access this log programmatically so I could check whether a
large number of articles have been deleted?
I have used the MediaWiki API <https://en.wikipedia.org/w/api.php> to get
articles and the API returns missing for some articles, but this does not
seem to be fully accurate for determining if an article has been deleted
because the API has returned 'missing' for articles that do exist.
To summarize, my main question is: given an article language edition and
article title, or an article pageid, is there an API to check if the
article has been deleted?
Any help would be greatly appreciated!
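One possible approach, offered as a minimal sketch rather than a definitive
answer: the MediaWiki Action API exposes the same deletion log through
list=logevents with letype=delete and letitle=<title>. The helper names
below (deletion_log_url, fetch_deletion_log, latest_deletion_state) and the
User-Agent string are hypothetical, chosen only for illustration. The sketch
assumes the default ordering (newest entries first) and only looks at
"delete" and "restore" actions. A page that was deleted and later recreated
from scratch has no "restore" entry, so the result is best combined with the
regular "missing" check rather than used on its own.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def deletion_log_url(lang, title, limit=10):
    """Build an Action API URL that lists deletion-log entries for one title."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "delete",   # restrict to the deletion log
        "letitle": title,     # entries for this exact title only
        "lelimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def fetch_deletion_log(lang, title):
    """Fetch the log over the network (hypothetical User-Agent string)."""
    request = Request(
        deletion_log_url(lang, title),
        headers={"User-Agent": "deletion-check-sketch/0.1 (research example)"},
    )
    with urlopen(request) as resp:
        return json.load(resp)


def latest_deletion_state(response):
    """Return 'deleted', 'restored', or None from the newest relevant entry.

    Assumes entries are ordered newest first, which is the API default.
    """
    events = response.get("query", {}).get("logevents", [])
    for event in events:
        action = event.get("action")
        if action == "delete":
            return "deleted"
        if action == "restore":
            return "restored"
    return None
```

For example, latest_deletion_state(fetch_deletion_log("en", "Some title"))
returns None when the log has no delete or restore entry for that title. For
a long list of titles, the requests should be spaced out and sent with a
descriptive User-Agent.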
Join the Research Team at the Wikimedia Foundation for their monthly
Office hours this Tuesday, 2021-11-02, at 12:00-13:00 UTC (5am PT/8am
ET/1pm CET). Please note the time change! We are experimenting with our
Office hours schedules to make our sessions more globally welcoming.
To participate, join the video-call via this link. There is no set
agenda - feel free to add your item to the list of topics in the etherpad.
You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g. about how to
attend) can be found here.
Through these office hours, we aim to make ourselves more available to
answer research-related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you in:
* You have a specific research-related question that you suspect you
  should be able to answer with the publicly available data and you don’t
  know how to find an answer for it, or you just need some more help with it.
  For example, how can I compute the ratio of anonymous to registered editors
  in my wiki?
* You run into repetitive or very manual work as part of your Wikimedia
  contributions and you wish to find out if there are ways to use machines to
  improve your workflows. These types of conversations can sometimes be
  harder to find an answer for during an office hour. However, discussing
  them can help us understand your challenges better and we may find ways to
  work with each other to support you in addressing them in the future.
* You want to learn what the Research team at the Wikimedia Foundation
  does and how we can potentially support you. Specifically for affiliates:
  if you are interested in building relationships with the academic
  institutions in your country, we would love to talk with you and learn
  more. We have a series of programs that aim to expand the network of
  Wikimedia researchers globally and we would love to collaborate with those
  of you interested more closely in this space.
* You want to talk with us about one of our existing programs.
Hope to see many of you,
Emily on behalf of the WMF Research Team
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
The next Wikimedia Research Showcase will be on October 27, 16:30 UTC (9:30am
PT / 12:30pm ET / 18:30 CEST). The Wikimedia Foundation Research Team will
present on knowledge gaps.
Speaker: Wikimedia Foundation Research Team
Title: Automatic approaches to bridge knowledge gaps in Wikimedia projects
Abstract: In order to advance knowledge equity as part of the Wikimedia
Movement’s 2030 strategic direction, the Research team at the Wikimedia
Foundation has been conducting research to “Address Knowledge Gaps” as one
of its main programs. One core component of this program is to develop
technologies to bridge knowledge gaps. In this talk, we give an overview on
how we approach this task using tools from Machine Learning in four
different contexts: section alignment in content translation, link
recommendation in structured editing, image recommendation in multimedia
knowledge gaps, and the equity of the recommendations themselves. We will
present how these models can assist contributors in addressing knowledge
gaps. Finally, we will discuss the impact of these models in applications
deployed across Wikimedia projects supporting different Product initiatives
at the Wikimedia Foundation.
* Section alignment:
* Link recommendation:
* Image recommendation:
* Equity in recommendations:
Janna Layton (she/her)
Administrative Associate - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
If you're a researcher (whether from academia, industry, other sectors,
or independent), you'll probably be interested in participating in the
online session "Scientific greetings", which will be held this Sunday,
31 October, as part of the WikidataCon 2021
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021>. In this
condensed session each researcher will have 5 minutes to present what
aspects of Wikidata they're studying or how Wikidata is useful for their
research, find out what other colleagues are working on, and ask for or
We're sending this email to inform you that prior registration is
required to present at this session, so we encourage you to follow the
1. Sign up for the WikidataCon 2021
(29-31 Oct) if you haven't already done so. It's free and requires no
2. *Add your name or username and, optionally, other details here as
soon as possible*. Slots will be allocated on a first-come,
first-served basis.
If you want to prepare slides, feel free to use this template
If you're planning a pre-recorded session, please upload it to YouTube
or Vimeo (unfortunately, we can't display videos from Commons).
Please feel free to share this email with anyone who might find it
useful, and write to us if you have any questions. We hope you enjoy the
session and the conference.
The organizers of the session,
Tiago, Gabriel and David
The Journal of Web Semantics (JWS) invites submissions for a special
issue on Community-based Knowledge Bases and Knowledge Graphs, edited by
Tim Finin, Sebastian Hellmann, David Martin, and Elena Simperl (contact
email: cbkb(a)cs.umbc.edu). Submissions are due
by November 1, 2021. Please see the JWS post here:
Community-based knowledge bases (KBs) and knowledge graphs (KGs) are
critical to many domains. They contain large amounts of information,
used in applications as diverse as search, question-answering systems,
and conversational agents. They are the backbone of linked open data,
helping connect entities from different datasets. Finally, they create
rich knowledge engineering ecosystems, making significant, empirical
contributions to our understanding of KB/KG science, engineering, and
practices. From here forward, we use "KB" to include both knowledge
bases and knowledge graphs. Also, "KB" and "knowledge" encompass both
ontology/schema and data.
Community-based KBs come in many shapes and sizes, but they tend to
share a number of commonalities:
* They are created through the efforts of a group of contributors,
  following a set of agreed goals, policies, practices, and quality norms.
* They are available under open licenses.
* They are central to knowledge-sharing networks bringing together
* They serve the needs of a community of users, including, but not
  restricted to, their contributor base.
* Many draw their content from crowdsourced resources (such as
Examples of community-based KBs include Wikidata, DBpedia, ConceptNet,
GeoNames, FrameNet, and Yago. This special issue will highlight recent
research, challenges, and opportunities in the field of community-based
KBs and the interaction and processes between stakeholders and the KBs.
We welcome papers on a wide variety of topics. Papers that focus on the
participation of a community of contributors are especially encouraged.
Topics of interest
We are looking for studies, frameworks, methods, techniques and tools on
topics such as the following:
* The impact of community involvement on characteristics of KBs such
  as requirements, design, technology choices, policies, etc. For
  example, how are KB characteristics driven by the community and
  reflective of the community's needs?
* Conversely, the impact of KB characteristics on community
  involvement. For example, how do changes in these characteristics
  affect the participation and behavior of members of the community?
* Organizational challenges and solutions in developing and managing
* Technical challenges and solutions in community-based KBs,
  concerning a technical area such as:
  * Representation of knowledge and logical foundations
  * Reasoning, querying, and constraint-checking
  * Knowledge preparation (e.g., cleaning, deduplication, alignment,
  * Maintaining consistency with external sources
  * Representing and managing metadata (including issues involved in
    adding metadata to relation instances)
* User interfaces and experience, both for contributing to the KB and
  using it, by different user groups.
* Implemented metrics and quality tests to guide the community in
  improving KG quality and expanding KG coverage.
* Achieving and managing knowledge diversity, for instance, in the
  form of multilinguality, multi-cultural coverage, multiple points of
  view, and a diverse and inclusive contributor base.
* Detecting and avoiding malicious, inappropriate, and misleading
  content in community-based KBs.
* Biases in community-based KBs and their impact on downstream uses of
* Community-based KBs in science, medicine, law, government, or other
* Handling specialized types of knowledge (such as commonsense,
  probabilistic, or linguistic knowledge) in a community setting.
* Methods and tools to manage KB evolution, including change
  detection, change management, conflict resolution, visualization of
* Tools and affordances supporting community or collaborative
  activities, including discussions, feedback, decision making, task
* Motivations and incentives affecting community participation.
* Approaches and metrics for community health, including but not
  restricted to community growth or diversity.
* Roles and participation profiles in communities building and
* Frameworks and approaches to support group decision-making and
Types of Papers
We invite submission of Research, Survey, Ontology, and System papers,
according to the guidelines given at https://www.jws-volumes.com
The Journal of Web Semantics solicits original scientific contributions
of high quality. Following the overall mission of the journal, we
emphasize the publication of papers that combine theories, methods and
experiments from different subject areas in order to deliver innovative
semantic methods and applications. The publication of large-scale
experiments and their analysis is also encouraged to clearly illustrate
scenarios and methods that introduce semantics into existing Web
interfaces, contents and services.
Submission of your manuscript is welcome provided that it, or any
translation of it, has not been copyrighted or published and is not
being submitted for publication elsewhere.
Manuscripts should be prepared for publication in accordance with
instructions given in the JWS guide for authors
The submission and review process will be carried out using Elsevier's
Web-based EM system
<https://www.editorialmanager.com/JOWS/default.aspx>. Please state the
name of the SI in your cover letter and, at the time of submission,
please select “VSI:CBKB” when reaching the Article Type selection.
Upon acceptance of an article, the author(s) will be asked to transfer
copyright of the article to the publisher. This transfer will ensure the
widest possible dissemination of information. Elsevier's liberal preprint
policy permits authors and their institutions to host preprints on their
web sites. Preprints of the articles will be made freely accessible via JWS First
Final copies of accepted publications will appear in print and at
Elsevier's archival online server.
Submission deadline: November 1, 2021
Author notification: February 7, 2022
Minor revisions due: February 21, 2022
Major revisions due: March 14, 2022
Papers appear on JWS preprint server: May 2, 2022
Publication: Fall or Winter 2022
Tim Finin is the Willard and Lillian Hackerman Chair in Engineering and
a Professor of Computer Science and Electrical Engineering at the
University of Maryland, Baltimore County (UMBC).
Sebastian Hellmann is the head of the “Knowledge Integration and
Language Technologies (KILT)” Competence Center at InfAI, Leipzig. He
also is the executive director and board member of the non-profit
DBpedia Association with over 30 key players
<https://www.dbpedia.org/members/overview/> in the knowledge graph area.
He earned a rank in AMiner’s top 10 of the most influential scholars in
knowledge engineering of the last decade.
David L. Martin is a Research & Development Scientist in Artificial
Intelligence. He has held positions at SRI International, Siri, Inc.,
Apple, Nuance Communications, Samsung Research America, and the
University of California at Santa Cruz. He is a Senior Member of the
Association for the Advancement of Artificial Intelligence, and
currently works as an independent consultant in Silicon Valley, California.
Elena Simperl is professor of computer science at King’s College London,
a Fellow of the British Computer Society and former Turing fellow.
According to AMiner, she is in the top 100 most influential scholars in
knowledge engineering of the last decade, as well as in the Women in AI
2000 ranking. Before joining King’s College, she held positions at the
University of Southampton, as well as in Germany and Austria.
I am pleased to announce that Wikimedia Enterprise's HTML dumps for
October 17-18th are available for public download; see
https://dumps.wikimedia.org/other/enterprise_html/ for more information. We
expect to make updated versions of these files available around the 1st/2nd
of the month and the 20th/21st of the month, following the cadence of the
standard SQL/XML dumps.
This is still an experimental service, so there may be hiccups from time to
time. Please be patient and report issues as you find them. Thanks!
Ariel "Dumps Wrangler" Glenn
 See https://www.mediawiki.org/wiki/Wikimedia_Enterprise for much more
about Wikimedia Enterprise and its API.