This email is a reminder to start proposing projects and sessions for this
year’s Wikimania hackathon, taking place on August 13th (the first day of
Wikimania). You may have already seen the announcement email last week with
all the details.
This year’s hackathon will take place in a 24-hour format covering all time
zones. You can start proposing a session or a project by following the
guidelines here: <
We also encourage folks to reuse the video of a previously recorded session
(in lecture format) to organize a watch-party-style session. You can
pause during or after the video for questions and discussion. That way, you
have less to prepare, and people can watch your session at any time during
the event.
There isn't a specific focus area proposed this time, so your session or
project needn't fit under a theme. Anything that involves MediaWiki
development or covers a technical area in the Wikimedia ecosystem is
welcome. We also welcome sessions that would be helpful to newcomers,
whether they are new to the event, our technical spaces, or our projects.
If you have any questions, feel free to use the related talk page to
contact the organizing team or join the community conversation on Telegram:
We look forward to your participation!
On behalf of the Wikimania Hackathon organizing team
Senior Developer Advocate
Wikimedia Foundation <https://wikimediafoundation.org/>
Join the Research Team at the Wikimedia Foundation for their monthly
Office hours this Tuesday, 2021-08-03, at 16:00-17:00 UTC (9am PT/6pm
To participate, join the video call via this link. There is no set
agenda; feel free to add your item to the list of topics in the etherpad
(you can do this after you join the meeting, too). Otherwise, you are
welcome to just hang out. More detailed information (e.g. about how to
attend) can be found here.
Through these office hours, we aim to make ourselves more available to
answer some of the research related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:
- You have a specific research-related question that you suspect you
should be able to answer with the publicly available data, but you don’t
know how to find an answer for it, or you just need some more help with it.
For example, how can I compute the ratio of anonymous to registered editors
in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
contributions and you wish to find out if there are ways to use machines to
improve your workflows. These types of questions can be harder to answer
during an office hour. However, discussing them can help us understand your
challenges better, and we may find ways to work with each other to support
you in addressing them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
does and how we can potentially support you. Specifically for affiliates:
if you are interested in building relationships with the academic
institutions in your country, we would love to talk with you and learn
more. We have a series of programs that aim to expand the network of
Wikimedia researchers globally, and we would love to collaborate more
closely with those of you interested in this space.
- You want to talk with us about one of our existing programs.
Hope to see many of you,
Martin on behalf of the WMF Research Team
[Cross-posting from the Wikidata chat]
Following some feedback by Azertus (thanks!), I collected statistics on
the most frequent Web domains that occur in Discogs and MusicBrainz.
It looks like some of them may be candidates for identifier
property creation, while others stem from a failed match against known
properties, mainly due to inconsistencies in URL match pattern (P8966),
format as a regular expression (P1793), and formatter URL (P1630) values.
You can have a look at them here.
It would be great to gather thoughts on the next steps.
Two main questions:
1. Should we go for a property proposal for each of the candidates?
2. What's the best way to fix URL match pattern (P8966), format as a
regular expression (P1793), and formatter URL (P1630) values, so that
the next time we can convert URLs to proper identifiers?
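
To illustrate the second question, here is a minimal Python sketch of how a
URL could be turned into an identifier using the three property values, and
how an inconsistency between them might be detected. The property record, the
example.org patterns, and the function names are all hypothetical and are not
taken from Wikidata; real P8966 values may also need normalization that this
sketch skips.

```python
import re
from typing import Optional

# Hypothetical property record. The values are illustrative only and do not
# describe any real Wikidata property.
EXAMPLE_PROPERTY = {
    # P8966-style: regex whose first capture group is the identifier
    "url_match_pattern": r"^https?://(?:www\.)?example\.org/artist/(\d+)",
    # P1793-style: regex the identifier itself must fully match
    "format_regex": r"\d+",
    # P1630-style: URL template where "$1" stands for the identifier
    "formatter_url": "https://www.example.org/artist/$1",
}


def url_to_identifier(url: str, prop: dict) -> Optional[str]:
    """Extract an identifier from a URL, or return None if it does not fit."""
    match = re.search(prop["url_match_pattern"], url)
    if match is None:
        return None
    identifier = match.group(1)
    # Reject identifiers that the format regex does not accept.
    if re.fullmatch(prop["format_regex"], identifier) is None:
        return None
    return identifier


def identifier_to_url(identifier: str, prop: dict) -> str:
    """Build a URL from an identifier using the formatter URL template."""
    return prop["formatter_url"].replace("$1", identifier)


def is_consistent(prop: dict, sample_id: str) -> bool:
    """Check that formatter URL and URL match pattern round-trip a sample ID."""
    return url_to_identifier(identifier_to_url(sample_id, prop), prop) == sample_id
```

A round-trip check like is_consistent could be run over each property with a
known-good sample identifier to list the properties whose three values
disagree, which are the ones where a URL would fail to match.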
Greetings reliable & underrepresented knowledge sources enthusiasts!
We are Diego de la Hera <https://meta.wikimedia.org/wiki/User:Diegodlh> and
Scann <https://meta.wikimedia.org/wiki/User:Scann>, longtime friends and
collaborators in several open knowledge projects. Recently, the Wikimedia
Foundation awarded us a grant to develop Web2Cit: Visual Editor for Citoid
The product that will be developed through this grant is a visual editor
that allows non-technical users to improve Citoid’s ability to add
references to Wikipedia articles. The aim of the project is to widen the
coverage of underrepresented sources in Wikipedia.
Now, as we start moving with this project, we are putting together an
advisory board that will help us build sustainability and community
involvement for this project. This is an open call for volunteers that want
to help us think through some critical aspects of the project and overall
help us steward it to its current and future success.
Here you can find more information about the current call for members:
*The deadline for applying is August 9th, 2021.* We're looking forward to
hearing from you!
Scann & Diego
We extended the deadline for submissions to the Wikidata Workshop to
*August 6*. We are very much looking forward to your contributions. Please
find more information below.
The Second Wikidata Workshop
Co-located with the 20th International Conference on Semantic Web (ISWC
Date: October 24 or 25, 2021
The workshop will be held online, afternoon European time.
== Important dates ==
Papers due: Friday, August 6, 2021 (EXTENDED)
Notification of accepted papers: Friday, September 24, 2021
Camera-ready papers due: Monday, October 4, 2021
Workshop date: October 24/25, 2021
== Overview ==
Wikidata is an openly available knowledge base, hosted by the Wikimedia
Foundation. It can be accessed and edited by both humans and machines and
acts as a common structured-data repository for several Wikimedia projects,
including Wikipedia, Wiktionary, and Wikisource. It is used in a variety of
applications by researchers and practitioners alike.
In recent years, we have seen an increase in the number of publications
around Wikidata. While there are several dedicated venues for the broader
Wikidata community to meet, none of them focuses on publishing original,
peer-reviewed research. This workshop fills this gap - we hope to provide a
forum to build this fledgling scientific community and promote novel work
and resources that support it.
The workshop seeks original contributions that address the opportunities
and challenges of creating, contributing to, and using a global,
collaborative, open-domain, multilingual knowledge graph such as Wikidata.
We encourage a range of submissions, including novel research, opinion
pieces, and descriptions of systems and resources, which are naturally
linked to Wikidata and its ecosystem or enabled by it. What we’re less
interested in are works that use Wikidata alongside or in lieu of other
resources to carry out some computational task - unless the work feeds back
into the Wikidata ecosystem, for instance by improving or commenting on
some Wikidata aspect, or suggesting new design features, tools, and
We also encourage submissions on the topic of Abstract Wikipedia,
particularly around collaborative code management, natural language
generation by a community, the abstract representation of knowledge, and
the interaction between Abstract Wikipedia and Wikidata on the one hand, and
Abstract Wikipedia and the language Wikipedias on the other.
We welcome interdisciplinary work, as well as interesting applications that
shed light on the benefits of Wikidata and discuss areas of improvement.
The workshop is planned as an interactive half-day event, in which most of
the time will be dedicated to discussions and exchange rather than oral
presentations. For this reason, all accepted papers will be presented in
short talks and accompanied by a poster. All works will be presented
== Topics ==
Topics of submissions include, but are not limited to:
- Data quality and vandalism detection in Wikidata
- Referencing in Wikidata
- Anomaly, bias, or novelty detection in Wikidata
- Algorithms for aligning Wikidata with other knowledge graphs
- The Semantic Web and Wikidata
- Community interaction in Wikidata
- Multilingual aspects in Wikidata
- Machine learning approaches to improve data quality in Wikidata
- Tools, bots, and datasets for improving or evaluating Wikidata
- Participation, diversity, and inclusivity aspects in the Wikidata
- Human-bot interaction
- Managing knowledge evolution in Wikidata
- Abstract Wikipedia
== Submission guidelines ==
We welcome the following types of contributions.
- Full research paper: Novel research contributions (7-12 pages)
- Short research paper: Novel research contributions of smaller scope than
full papers (3-6 pages)
- Position paper: Well-argued ideas and opinion pieces, not yet in the
scope of a research contribution (6-8 pages)
- Resource paper: New dataset or other resources directly relevant to
Wikidata, including the publication of that resource (8-12 pages)
- Demo paper: New system critically enabled by Wikidata (6-8 pages)
Submissions must be submitted as PDF or HTML, formatted in the style of the Springer
Publications format for Lecture Notes in Computer Science (LNCS). For
details on the LNCS style, see Springer’s Author Instructions.
The papers will be peer-reviewed by at least three researchers. Accepted
papers will be published as open access papers on CEUR (we will only
publish to CEUR if the authors agree to have their papers published).
Papers have to be submitted through EasyChair:
== Proceedings ==
The complete set of papers will be published with the CEUR Workshop
== Organizing committee ==
Lucie-Aimée Kaffee, University of Southampton, lucie.kaffee[[(a)]]gmail.com
Simon Razniewski, Max Planck Institute for Informatics, srazniew[[@]]
Aidan Hogan, University of Chile, ahogan[[(a)]]dcc.uchile.cl
== Programme committee ==
Miriam Redi, Wikimedia Foundation
John Samuel, CPE Lyon
Dennis Diefenbach, University Jean Monnet
Lydia Pintscher, Wikimedia Deutschland
Edgar Meij, Bloomberg L.P.
Thomas Pellissier Tanon, Lexistems
Hiba Arnaout, MPI for Informatics
Fabian Suchanek, Télécom ParisTech
Filip Ilievski, ISI
Marco Ponza, Bloomberg L.P.
Heiko Paulheim, University of Mannheim
Cristina Sarasua, University of Zurich
Pavlos Vougiouklis, Huawei Technologies, Edinburgh
Finn Årup Nielsen, Technical University of Denmark
Andrew D. Gordon, Microsoft Research & University of Edinburgh
The Journal of Web Semantics (JWS) invites submissions for a special
issue on Community-based Knowledge Bases and Knowledge Graphs, edited by
Tim Finin, Sebastian Hellmann, David Martin, and Elena Simperl (contact
email: cbkb(a)cs.umbc.edu). Submissions are due
by November 1, 2021. Please see the JWS post here:
Community-based knowledge bases (KBs) and knowledge graphs (KGs) are
critical to many domains. They contain large amounts of information,
used in applications as diverse as search, question-answering systems,
and conversational agents. They are the backbone of linked open data,
helping connect entities from different datasets. Finally, they create
rich knowledge engineering ecosystems, making significant, empirical
contributions to our understanding of KB/KG science, engineering, and
practices. From here forward, we use "KB" to include both knowledge
bases and knowledge graphs. Also, "KB" and "knowledge" encompass both
ontology/schema and data.
Community-based KBs come in many shapes and sizes, but they tend to
share a number of commonalities:
- They are created through the efforts of a group of contributors,
following a set of agreed goals, policies, practices, and quality norms.
- They are available under open licenses.
- They are central to knowledge-sharing networks bringing together
- They serve the needs of a community of users, including, but not
restricted to, their contributor base.
- Many draw their content from crowdsourced resources (such as
Examples of community-based KBs include Wikidata, DBpedia, ConceptNet,
GeoNames, FrameNet, and Yago. This special issue will highlight recent
research, challenges, and opportunities in the field of community-based
KBs and the interaction and processes between stakeholders and the KBs.
We welcome papers on a wide variety of topics. Papers that focus on the
participation of a community of contributors are especially encouraged.
Topics of interest
We are looking for studies, frameworks, methods, techniques and tools on
topics such as the following:
- The impact of community involvement on characteristics of KBs such
as requirements, design, technology choices, policies, etc. For
example, how are KB characteristics driven by the community and
reflective of the community's needs?
- Conversely, the impact of KB characteristics on community
involvement. For example, how do changes in these characteristics
affect the participation and behavior of members of the community?
- Organizational challenges and solutions in developing and managing
- Technical challenges and solutions in community-based KBs,
concerning a technical area such as:
- Representation of knowledge and logical foundations
- Reasoning, querying, and constraint-checking
- Knowledge preparation (e.g., cleaning, deduplication, alignment,
- Maintaining consistency with external sources
- Representing and managing metadata (including issues involved in
adding metadata to relation instances)
- User interfaces and experience, both for contributing to the KB and
using it, by different user groups.
- Implemented metrics and quality tests to guide the community in
improving KG quality and expanding KG coverage.
- Achieving and managing knowledge diversity, for instance, in the
form of multilinguality, multi-cultural coverage, multiple points of
view, and a diverse and inclusive contributor base.
- Detecting and avoiding malicious, inappropriate, and misleading
content in community-based KBs.
- Biases in community-based KBs and their impact on downstream uses of
- Community-based KBs in science, medicine, law, government, or other
- Handling specialized types of knowledge (such as commonsense,
probabilistic, or linguistic knowledge) in a community setting.
- Methods and tools to manage KB evolution, including change
detection, change management, conflict resolution, visualization of
- Tools and affordances supporting community or collaborative
activities, including discussions, feedback, decision making, task
- Motivations and incentives affecting community participation.
- Approaches and metrics for community health, including but not
restricted to community growth or diversity.
- Roles and participation profiles in communities building and
- Frameworks and approaches to support group decision-making and
Types of Papers
We invite submission of Research, Survey, Ontology, and System papers,
according to the guidelines given at https://www.jws-volumes.com
The Journal of Web Semantics solicits original scientific contributions
of high quality. Following the overall mission of the journal, we
emphasize the publication of papers that combine theories, methods and
experiments from different subject areas in order to deliver innovative
semantic methods and applications. The publication of large-scale
experiments and their analysis is also encouraged to clearly illustrate
scenarios and methods that introduce semantics into existing Web
interfaces, contents and services.
Submission of your manuscript is welcome provided that it, or any
translation of it, has not been copyrighted or published and is not
being submitted for publication elsewhere.
Manuscripts should be prepared for publication in accordance with
instructions given in the JWS guide for authors
The submission and review process will be carried out using Elsevier's
Web-based EM system
<https://www.editorialmanager.com/JOWS/default.aspx>. Please state the
name of the SI in your cover letter and, at the time of submission,
please select “VSI:CBKB” when reaching the Article Type selection.
Upon acceptance of an article, the author(s) will be asked to transfer
copyright of the article to the publisher. This transfer will ensure the
widest possible dissemination of information. Elsevier's liberal preprint
policy permits authors and their institutions to host preprints on their web sites.
Preprints of the articles will be made freely accessible via JWS First
Final copies of accepted publications will appear in print and at
Elsevier's archival online server.
Submission deadline: November 1, 2021
Author notification: February 7, 2022
Minor revisions due: February 21, 2022
Major revisions due: March 14, 2022
Papers appear on JWS preprint server: May 2, 2022
Publication: Fall or Winter 2022
Tim Finin is the Willard and Lillian Hackerman Chair in Engineering and
a Professor of Computer Science and Electrical Engineering at the
University of Maryland, Baltimore County (UMBC).
Sebastian Hellmann is the head of the “Knowledge Integration and
Language Technologies (KILT)” Competence Center at InfAI, Leipzig. He
also is the executive director and board member of the non-profit
DBpedia Association with over 30 key players
<https://www.dbpedia.org/members/overview/> in the knowledge graph area.
He earned a rank in AMiner’s top 10 of the most influential scholars in
knowledge engineering of the last decade.
David L. Martin is a Research & Development Scientist in Artificial
Intelligence. He has held positions at SRI International, Siri, Inc.,
Apple, Nuance Communications, Samsung Research America, and the
University of California at Santa Cruz. He is a Senior Member of the
Association for the Advancement of Artificial Intelligence, and
currently works as an independent consultant in Silicon Valley, California.
Elena Simperl is a professor of computer science at King’s College London,
a Fellow of the British Computer Society and former Turing fellow.
According to AMiner, she is in the top 100 most influential scholars in
knowledge engineering of the last decade, as well as in the Women in AI
2000 ranking. Before joining King’s College, she held positions at the
University of Southampton, as well as in Germany and Austria.
Quick reminder: the Wikidata & Wikibase office hours are happening later
today at 18:00 CEST.
See you shortly!
On Mon, Jul 12, 2021 at 9:00 AM Mohammed Sadat Abdulai <
> The next Wikidata+Wikibase office hours will take place on Wednesday 28th
> July 2021 at 16:00 UTC (18:00 Berlin time) in the Wikidata Telegram group
> *The Wikidata and Wikibase office hours are online events where the
> development team presents what we have been working on over the past
> quarter, and the community is welcome to ask questions and discuss
> important issues related to the development of Wikidata and Wikibase.*
> This month we will also be having a guest presentation about Toolhub
> <https://meta.wikimedia.org/wiki/Toolhub> by Srishti Sethi from the
> Wikimedia Foundation.
> Looking forward to seeing you all.
> Mohammed Sadat
> *Community Communications Manager for Wikidata/Wikibase*
Hi Community! And specifically probably Denny?
Did Wikidata ever explore directly storing or having the ability to store
weak/strong subclasses as in Prototype theory?
A Guppy <https://www.wikidata.org/wiki/Q178202> is:
not typical of PET
typical of PET FISH
typical of FISH
Or was the decision to allow all kinds of classification schemes and
taxonomies simply through the direct use of introduced properties like
SKOS:related etc.?
It seems it was the latter?
But I couldn't find where some of those initial modeling decisions were
made and written down.
Perhaps they never were written but only voiced by Denny and others in the
early stages of design?
I poked around through Wikidata Events and even noticeboards, Project chat,
etc. but could not find some of those very, very early design decisions.
It would be nice to have a place to look with a link to a page in the
Community portal that says "History of Wikidata's design and early
collected meetings, notes, design documents, recordings"
(Concept Disambiguation (Pseudo Sense Morphs) are my basis to understand
more about some of the underpinning designs of the data model and knowledge
organization, since I have tons of really great ideas forming. With
potential needs later on to do some strong concept mapping <-> sense in
lots of ways, like classifier systems that borrow from computer vision
research such as "shape dictionaries" against Abstract Wikipedia Renderers
and Functions, with potential hints for Constructors, as it begins to be a
realized part of knowledge exchange between communities)