Hello all,
*The WikidataCon <https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021>,
the conference dedicated to the Wikidata community* and co-organized by
Wikimedia Deutschland and Wiki Movimento Brasil, is taking place online on
October 29-31, around the theme “a sustainable future for Wikidata”. In
this update, we are very excited to share with you the first announcements
about the program of the event!
On October 29th, the first day of the conference, you will discover a *program
of keynotes and round table discussions* carefully curated by the
organizing team. Knowledge equity, Abstract Wikipedia, the technical
challenges of the Query Service, and an overview on how Brazilian cultural
institutions use Wikidata, are some of the topics that will be presented
during this first day. We will also have various celebrations related to
Wikidata’s anniversary that will allow us to celebrate achievements of the
community and discover new projects. You can find an overview of the
program of Day 1 here
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021/Program/Day_1_-_Mai…>.
On the next two days of the conference, the program will be made up of
*presentations, discussions and workshops coming directly from the
participants*. We will have several thematic tracks split across three time
slots, to allow people
from various time zones to attend. The content of these tracks will be
proposed by the Wikidata community. This process will start at the
beginning of October. More information will be added on this page in the
next few weeks.
To support us in running this very ambitious event, five organisations are
sponsoring the WikidataCon. You will find a description of the sponsors here
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021/Organization/Sponso…>.
Some of these organizations are already benefiting from the amazing content
of Wikidata, and their contribution to the conference is one of the ways
they can engage more with the community and support the future of the
project.
October is also the celebration of *Wikidata’s ninth anniversary*, as the
project was officially started on October 29th, 2012. Every year, community
members work together on birthday presents that will be showcased during
the WikidataCon. You will find the ninth birthday page here
<https://www.wikidata.org/wiki/Wikidata:Ninth_Birthday>, and you can
already think about a birthday present you would like to prepare.
Last but not least: the registration for the conference will start on
September 15th. More information will be added on the Attend page
<https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021/Attend> at that
time.
Thanks for reading this update! You can find out more about the WikidataCon
on this page <https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021>, and
read the previous updates on the talk page
<https://www.wikidata.org/wiki/Wikidata_talk:WikidataCon_2021>. If you have
any questions or suggestions, feel free to reach out to us by leaving a
comment on the talk page
<https://www.wikidata.org/wiki/Wikidata_talk:WikidataCon_2021> or
contacting info@wikidatacon.org.
For the WikidataCon organization team,
--
Léa Lacroix
Community Engagement Coordinator
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg (district court) under number 23855 Nz. Recognized as
a charitable organization by the Finanzamt für Körperschaften I Berlin (tax
office), tax number 27/029/42207.
** Second Call for Papers - please circulate this CFP to your colleagues
and networks **
-- 17th International Conference on Information Assurance and Security (IAS
2021) --
http://www.mirlabs.org/ias21
http://www.mirlabs.net/ias21
On the World Wide Web
December 14-16, 2021
Proceedings of IAS'21 will be published in Lecture Notes in Networks and
Systems, Springer (Indexed in SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH,
SCImago).
<https://www.springer.com/series/15179>
IAS 2020 Proceedings: https://www.springer.com/gp/book/9783030736880
History of IAS series: http://www.mirlabs.net/ias21/previous.php
**Important Dates**
---------------------
Paper submission due: September 30, 2021
Notification of paper acceptance: October 31, 2021
Registration and Final manuscript due: November 15, 2021
Conference: December 14-16, 2021
**About IAS 2021**
---------------------
Information assurance and security have become important research issues in
the networked and distributed information sharing environments. Finding
effective ways to protect information systems, networks, and sensitive data
within the critical information infrastructure is challenging even with the
most advanced technology and trained professionals. The 17th International
Conference on Information Assurance and Security (IAS) aims to bring
together researchers, practitioners, developers, and policymakers involved
in various information assurance and security disciplines to exchange ideas
and learn the latest developments in this crucial field.
**Topics (not limited to)**
---------------------------
Information Assurance, Security Mechanisms, Methodologies and Models
Authentication and Identity Management
Authorization and Access Control
Trust Negotiation, Establishment and Management
Anonymity and User Privacy
Data Integrity and Privacy
Network Security
Operating System Security
Database Security
Intrusion Detection
Security Attacks
Security Oriented System Design
Security and Performance trade-off
Security Management and Strategy
Security Verification, Evaluations and Measurements
Secure Software Technologies
New Ideas and Paradigms for Security
Cryptography
Cryptographic Protocols
Key Management and Recovery
Secure System Architectures and Security Application
Image Engineering, Multimedia Signal Processing and Communication Security
**Submission Guidelines**
-------------------------
Submission of a paper should be made through the submission page from the
conference web page.
Please refer to the conference website for guidelines to prepare your
manuscript.
Paper format templates:
https://www.springer.com/de/authors-editors/book-authors-editors/manuscript…
IAS’21 Submission Link:
https://easychair.org/conferences/?conf=ias2021
**Plenary Talks**
-----------------------------
You are welcome to attend 8 Keynote Talks offered by world-renowned
professors and industry leaders. The detailed information is available on
the conference website.
Speaker 1: Antônio de Padua Braga, Federal University of Minas Gerais,
Brazil
Title: Large margin classification with graph-based models
Speaker 2: Juergen Branke, University of Warwick, Coventry, United Kingdom
Title: Learning to optimise – optimal learning
Speaker 3: Oscar Cordon, University of Granada, Spain
Title: Hybrid Intelligent Systems for Forensic Anthropology and Human
Identification
Speaker 4: Kalyanmoy Deb, Michigan State University, USA
Title: TBA
Speaker 5: Andries Engelbrecht, University of Stellenbosch, South Africa
Title: Set-based Particle Swarm Optimization Approach to Portfolio
Optimization
Speaker 6: Frédéric GUINAND, Le Havre Normandy University, France
Title: Swarms of Unmanned Aerial Vehicles
Speaker 7: Cengiz Toklu, Beykent University, Istanbul, Turkey
Title: Hybrid Algorithms. Applications to Structural Mechanics
Speaker 8: Günther Raidl, Technische Universität Wien, Austria
Title: Combinatorial Optimization Meets (Reinforcement) Learning
** IAS 2021 Organization **
------------------------------
General Chairs
Ajith Abraham, Machine Intelligence Research Labs (MIR Labs), USA
Vincenzo Piuri, University of Milan, Italy
Patrick Siarry, Université Paris-Est Créteil, France
Program Chairs
Oscar Castillo, Tijuana Institute of Technology, Mexico
Patrick Hung, Ontario Tech University, Canada
**Technical Contact**
---------------------
Dr. Ajith Abraham
Email: ajith.abraham(a)ieee.org
Thanks and regards
Publicity Chair IAS 2021
Anu Bajaj
*Research Associate:* *Machine Intelligence Research Labs (MIR Labs)*
<https://www.mirlabs.org/>
Scientific Network for Innovation and Research Excellence
11, 3rd Street NW
P.O. Box 2259
Auburn, Washington 98071, USA
Hi all,
Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours next Tuesday, 2021-09-07, at 16:00-17:00 UTC (9am PT/6pm
CEST).
To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3] (you can also do this after you join the meeting). Otherwise, you are
welcome to just hang out. More detailed information (e.g. about how to
attend) can be found here [4].
Through these office hours, we aim to make ourselves more available to
answer some of the research related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:
- You have a specific research-related question that you suspect you
  should be able to answer with the publicly available data and you don’t
  know how to find an answer for it, or you just need some more help with it.
  For example, how can I compute the ratio of anonymous to registered editors
  in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
  contributions and you wish to find out if there are ways to use machines to
  improve your workflows. These types of questions can sometimes be harder
  to answer during an office hour; however, discussing them can help us
  understand your challenges better, and we may find ways to work with each
  other to support you in addressing them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
  does and how we can potentially support you. Specifically for affiliates:
  if you are interested in building relationships with the academic
  institutions in your country, we would love to talk with you and learn
  more. We have a series of programs that aim to expand the network of
  Wikimedia researchers globally and we would love to collaborate more
  closely with those of you interested in this space.
- You want to talk with us about one of our existing programs [5].
Hope to see many of you,
Martin on behalf of the WMF Research Team
[1] https://research.wikimedia.org
[2] https://meet.jit.si/WMF-Research-Office-Hours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[5] https://research.wikimedia.org/projects.html
--
Martin Gerlach
Research Scientist
Wikimedia Foundation
The Journal of Web Semantics (JWS) invites submissions for a special
issue on Community-based Knowledge Bases and Knowledge Graphs, edited by
Tim Finin, Sebastian Hellmann, David Martin, and Elena Simperl. (contact
email: cbkb(a)cs.umbc.edu <mailto:cbkb@cs.umbc.edu>) Submissions are due
by November 01, 2021. Please see the JWS post here:
http://www.websemanticsjournal.org/2021/06/cfp-community-based-knowledge-ba…
Introduction
Community-based knowledge bases (KBs) and knowledge graphs (KGs) are
critical to many domains. They contain large amounts of information,
used in applications as diverse as search, question-answering systems,
and conversational agents. They are the backbone of linked open data,
helping connect entities from different datasets. Finally, they create
rich knowledge engineering ecosystems, making significant, empirical
contributions to our understanding of KB/KG science, engineering, and
practices. From here forward, we use "KB" to include both knowledge
bases and knowledge graphs. Also, "KB" and "knowledge" encompass both
ontology/schema and data.
Community-based KBs come in many shapes and sizes, but they tend to
share a number of commonalities:
* They are created through the efforts of a group of contributors,
  following a set of agreed goals, policies, practices, and quality norms.
* They are available under open licenses.
* They are central to knowledge-sharing networks bringing together
  various stakeholders.
* They serve the needs of a community of users, including, but not
  restricted to, their contributor base.
* Many draw their content from crowdsourced resources (such as
  Wikipedia, OpenStreetMap).
Examples of community-based KBs include Wikidata, DBpedia, ConceptNet,
GeoNames, FrameNet, and Yago. This special issue will highlight recent
research, challenges, and opportunities in the field of community-based
KBs and the interaction and processes between stakeholders and the KBs.
We welcome papers on a wide variety of topics. Papers that focus on the
participation of a community of contributors are especially encouraged.
Topics of interest
We are looking for studies, frameworks, methods, techniques and tools on
topics such as the following:
* The impact of community involvement on characteristics of KBs such
  as requirements, design, technology choices, policies, etc. For
  example, how are KB characteristics driven by the community and
  reflective of the community's needs?
* Conversely, the impact of KB characteristics on community
  involvement. For example, how do changes in these characteristics
  affect the participation and behavior of members of the community?
* Organizational challenges and solutions in developing and managing
  community-based KBs.
* Technical challenges and solutions in community-based KBs,
  concerning a technical area such as:
  o Representation of knowledge and logical foundations
  o Reasoning, querying, and constraint-checking
  o Knowledge acquisition
  o Knowledge preparation (e.g., cleaning, deduplication, alignment,
    merging)
  o Maintaining consistency with external sources
  o Representing and managing metadata (including issues involved in
    adding metadata to relation instances)
  o Provenance
  o Quality assurance
* User interfaces and experience, both for contributing to the KB and
  using it, by different user groups.
* Implemented metrics and quality tests to guide the community in
  improving KG quality and expanding KG coverage.
* Achieving and managing knowledge diversity, for instance, in the
  form of multilinguality, multi-cultural coverage, multiple points of
  view, and a diverse and inclusive contributor base.
* Detecting and avoiding malicious, inappropriate, and misleading
  content in community-based KBs.
* Biases in community-based KBs and their impact on downstream uses of
  KB content.
* Community-based KBs in science, medicine, law, government, or other
  domains.
* Handling specialized types of knowledge (such as commonsense,
  probabilistic, or linguistic knowledge) in a community setting.
* Methods and tools to manage KB evolution, including change
  detection, change management, conflict resolution, visualization of
  change history.
* Tools and affordances supporting community or collaborative
  activities, including discussions, feedback, decision making, task
  allocation, etc.
* Motivations and incentives affecting community participation.
* Approaches and metrics for community health, including but not
  restricted to community growth or diversity.
* Roles and participation profiles in communities building and
  maintaining KBs.
* Frameworks and approaches to support group decision-making and
  resolve conflicts.
Types of Papers
We invite submission of Research, Survey, Ontology, and System papers,
according to the guidelines given at https://www.jws-volumes.com
<https://www.jws-volumes.com/>.
Submission Guidelines
The Journal of Web Semantics solicits original scientific contributions
of high quality. Following the overall mission of the journal, we
emphasize the publication of papers that combine theories, methods and
experiments from different subject areas in order to deliver innovative
semantic methods and applications. The publication of large-scale
experiments and their analysis is also encouraged to clearly illustrate
scenarios and methods that introduce semantics into existing Web
interfaces, contents and services.
Submission of your manuscript is welcome provided that it, or any
translation of it, has not been copyrighted or published and is not
being submitted for publication elsewhere.
Manuscripts should be prepared for publication in accordance with
instructions given in the JWS guide for authors
<http://www.elsevier.com/journals/journal-of-web-semantics/1570-8268/guide-f…>.
The submission and review process will be carried out using Elsevier's
Web-based EM system
<https://www.editorialmanager.com/JOWS/default.aspx>. Please state the
name of the SI in your cover letter and, at the time of submission,
please select “VSI:CBKB” when reaching the Article Type selection.
Upon acceptance of an article, the author(s) will be asked to transfer
copyright of the article to the publisher. This transfer will ensure the
widest possible dissemination of information. Elsevier's liberal preprint
policy
<https://www.elsevier.com/authors/journal-authors/submit-your-paper/sharing-…>
permits authors and their institutions to host preprints on their web sites.
Preprints of the articles will be made freely accessible via JWS First
Look
<https://papers.ssrn.com/sol3/JELJOUR_Results.cfm?form_name=journalbrowse&jo…>.
Final copies of accepted publications will appear in print and at
Elsevier's archival online server.
Important Dates
* Submission deadline: November 1, 2021
* Author notification: February 7, 2022
* Minor revisions due: February 21, 2022
* Major revisions due: March 14, 2022
* Papers appear on JWS preprint server: May 2, 2022
* Publication: Fall or Winter 2022
Guest Editors
Tim Finin is the Willard and Lillian Hackerman Chair in Engineering and
a Professor of Computer Science and Electrical Engineering at the
University of Maryland, Baltimore County (UMBC).
Sebastian Hellmann is the head of the “Knowledge Integration and
Language Technologies (KILT)” Competence Center at InfAI, Leipzig. He
is also the executive director and a board member of the non-profit
DBpedia Association with over 30 key players
<https://www.dbpedia.org/members/overview/> in the knowledge graph area.
He earned a rank in AMiner’s top 10 of the most influential scholars in
knowledge engineering of the last decade.
David L. Martin is a Research & Development Scientist in Artificial
Intelligence. He has held positions at SRI International, Siri, Inc.,
Apple, Nuance Communications, Samsung Research America, and the
University of California at Santa Cruz. He is a Senior Member of the
Association for the Advancement of Artificial Intelligence, and
currently works as an independent consultant in Silicon Valley, California.
Elena Simperl is a professor of computer science at King’s College London,
a Fellow of the British Computer Society and former Turing fellow.
According to AMiner, she is in the top 100 most influential scholars in
knowledge engineering of the last decade, as well as in the Women in AI
2000 ranking. Before joining King’s College, she held positions at the
University of Southampton, as well as in Germany and Austria.
The International Workshop on Archives and Linked Data (
https://linkedarchives.inesctec.pt/) aims to gather researchers and
specialists engaged in initiatives that cross Archives and the Semantic Web
and those planning similar initiatives in cultural heritage organizations.
The workshop is an online event, run in conjunction with the 25th
International Conference on Theory and Practice of Digital Libraries (TPDL
2021) (http://www.tpdl.eu/tpdl2021/) on September 13th.
Registration is free of cost. Participation is open to all, but
registration is mandatory:
https://www.eventbrite.co.uk/e/theory-and-practice-of-digital-libraries-tpd….
The workshop’s program is available at:
https://linkedarchives.inesctec.pt/#program.
Looking forward to your participation,
The workshop chairs
Hello all,
The *Data Quality Days
<https://www.wikidata.org/wiki/Wikidata:Events/Data_Quality_Days_2021>*,
eight days of community events focused on data quality, are starting on
Wednesday, September 8th, and there are plenty of very interesting sessions
taking place! Many thanks to everyone who proposed a topic and took the
lead to facilitate a discussion or a workshop.
Here are a few examples of events that will take place (you can see the
full schedule here
<https://www.wikidata.org/wiki/Wikidata:Events/Data_Quality_Days_2021#Events>):
- Data quality: what is it and why is it important?
- *EntitySchemas* and Shape Expressions on Wikidata
- How can we incorporate feedback from our biggest data re-users at
scale?
- Integration of the Czech authority files with Wikidata
- Editing livestream and editathon on *property constraints*
- Editathon to improve Help:Ranking and Help:Deduplication
- Overview of the *ontology issues* on Wikidata
- Increasing data quality by increasing Wikidata's visibility on other
wikis
- Bug Triage Hour on *data quality and maintenance*
- Presentations of various *patrolling and quality tools* (ORES, ProWD,
Item Quality Evaluator, SpeedPatrolling, Constraint Violation Checker,
WikidataComplete...)
The events are taking place online (most of them on the open source tool
Jitsi). Most of them are not recorded but notes will be taken. No
registration is needed; you can simply add your name to the participants
list
<https://www.wikidata.org/wiki/Wikidata:Events/Data_Quality_Days_2021/Partic…>
if you want to tell others that you are interested. It is still possible to
add sessions in the schedule if you want to take care of a discussion or
workshop. There's also a Phabricator board
<https://phabricator.wikimedia.org/project/board/5504/> where you can add
the projects you plan to work on.
We're looking forward to discussing data quality and tools with you! If you
have any questions, feel free to reach out to me.
Cheers,
--
Léa Lacroix
Community Engagement Coordinator
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
** Second Call for Papers - please circulate this CFP to your colleagues
and networks **
-- The 12th International Conference on Innovations in Bio-Inspired
Computing and Applications (IBICA'21) --
http://www.mirlabs.net/ibica21
http://www.mirlabs.org/ibica21
On the World Wide Web
December 16-18, 2021
Proceedings of IBICA'21 will be published in Lecture Notes in Networks and
Systems, Springer (Indexed in SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH,
SCImago).
<https://www.springer.com/series/15179>
Proceedings of IBICA'2020: https://www.springer.com/gp/book/9783030736026
History of IBICA series: http://www.mirlabs.net/ibica21/previous.php
**Important Dates**
---------------------
Paper submission due: September 30, 2021
Notification of paper acceptance: October 31, 2021
Registration and Final manuscript due: November 15, 2021
Conference: December 16-18, 2021
**About IBICA 2021**
------------------
IBICA'21 is the 12th International Conference on Innovations in
Bio-Inspired Computing and Applications. The aim of IBICA is to provide a
platform for world research leaders and practitioners to discuss the
"full-spectrum" of current theoretical developments, emerging technologies,
and innovative applications of Bio-inspired Computing. Bio-inspired
Computing is currently one of the most exciting research areas, and it is
continuously demonstrating exceptional strength in solving complex
real-life problems. The main driving force of the conference is to explore
further the intriguing potential of Bio-inspired Computing. IBICA 2021 will
be held online.
**Topics (not limited to)**
---------------------------
Bio-inspired Computing
Ant Colony System
Artificial Immune Systems
Artificial Intelligence
Artificial Neural Networks
Cellular Automaton
Cognitive Modeling
DNA Computing
Differential Evolution
Emergent Systems
Evolutionary Computations
Evolutionary Strategies/Programming
Fuzzy Logic
Genetic Algorithms/Programming
Granular Computing
Neutrosophic Systems
Organic Computing
P Systems
Particle Swarm Optimization
Emerging Technologies & Applications
Intelligent Distributed and High-Performance Architecture
**Submission Guidelines**
-------------------------
Submission of a paper should be made through the submission page from the
conference web page.
Please refer to the conference website for guidelines to prepare your
manuscript.
Paper format templates:
https://www.springer.com/de/authors-editors/book-authors-editors/manuscript…
IBICA’21 Submission Link:
https://easychair.org/conferences/?conf=ibica2021
**Plenary Talks**
-----------------------------
You are welcome to attend 8 Keynote Talks offered by world-renowned
professors and industry leaders. The detailed information is available on
the conference website.
Speaker 1: Antônio de Padua Braga, Federal University of Minas Gerais,
Brazil
Title: Large margin classification with graph-based models
Speaker 2: Juergen Branke, University of Warwick, Coventry, United Kingdom
Title: Learning to optimise – optimal learning
Speaker 3: Oscar Cordon, University of Granada, Spain
Title: Hybrid Intelligent Systems for Forensic Anthropology and Human
Identification
Speaker 4: Kalyanmoy Deb, Michigan State University, USA
Title: TBA
Speaker 5: Andries Engelbrecht, University of Stellenbosch, South Africa
Title: Set-based Particle Swarm Optimization Approach to Portfolio
Optimization
Speaker 6: Frédéric GUINAND, Le Havre Normandy University, France
Title: Swarms of Unmanned Aerial Vehicles
Speaker 7: Cengiz Toklu, Beykent University, Istanbul, Turkey
Title: Hybrid Algorithms. Applications to Structural Mechanics
Speaker 8: Günther Raidl, Technische Universität Wien, Austria
Title: Combinatorial Optimization Meets (Reinforcement) Learning
** IBICA 2021 Organization **
-------------------------
General Chairs
Ajith Abraham, Machine Intelligence Research Labs (MIR Labs), USA
Ana Maria Madureira, Instituto Superior de Engenharia do Porto, Portugal
Arturas Kaklauskas, Vilnius Gediminas Technical University, Lithuania
Program Chairs
Dalia Kriksciuniene, Vilnius University, Lithuania
João Carlos Ferreira, ISCTE-IUL, Portugal
Azah Kamilah Muda, Universiti Teknikal Malaysia Melaka, Malaysia
**Technical Contact**
---------------------
Dr. Ajith Abraham
Email: ajith.abraham(a)ieee.org
Thanks and regards
Publicity Chair IBICA 2021
Anu Bajaj
*Research Associate:* *Machine Intelligence Research Labs (MIR Labs)*
<https://www.mirlabs.org/>
Scientific Network for Innovation and Research Excellence
11, 3rd Street NW
P.O. Box 2259
Auburn, Washington 98071, USA
Hi all,
At around 1PM UTC today (Sep 3) we started experiencing stability issues
with WDQS, localized (at least at the moment) to one of our two
datacenters. Unfortunately, we haven't been able to pinpoint the issue as of
now. We suspect that someone is running a query that affects Blazegraph -
that has happened a few times in the past. Unfortunately, our usual tactics
did not help us find which one.
We are working on identifying the issue, but it's clear that this could
bring the service down within a few hours, so we are working on a quick
workaround. Since we observed that the issue only causes actual service
failures about 2h after a restart, for now we are going to introduce a
procedure that will restart servers randomly, so that the uptime of each
stays at around 1h at most. Only one server should be restarted at any given
time. This will cause some queries to be killed whenever a server is
restarted, but the alternative is worse.
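As a rough illustration of the workaround described above (not the actual
procedure used for WDQS), a staggered random restart plan could be sketched
in Python as follows. The server names, the 120-second restart duration, and
the function `plan_restarts` are all hypothetical assumptions. Each server
gets its own slot in every cycle, so restarts never overlap, and the slot
arithmetic keeps every server's uptime below the chosen maximum.

```python
import random


def plan_restarts(servers, max_uptime_s=3600.0, restart_s=120.0, cycles=3, seed=None):
    """Return a time-ordered list of (start_time_s, server) restart events.

    Assumptions (hypothetical, for illustration only):
    - a restart takes at most `restart_s` seconds;
    - time 0 is the moment all servers were last started.
    """
    rng = random.Random(seed)
    n = len(servers)
    if n == 0:
        return []
    # Split the maximum uptime into n + 1 slots; one cycle uses n of them.
    slot = max_uptime_s / (n + 1)
    if restart_s >= slot:
        raise ValueError("too many servers or restarts too slow for this max uptime")
    cycle = slot * n
    # Random but fixed order, so each server keeps the same slot in every cycle.
    order = list(servers)
    rng.shuffle(order)
    events = []
    for c in range(cycles):
        for i, server in enumerate(order):
            # Random start inside the slot, leaving room for the restart to finish.
            jitter = rng.uniform(0.0, slot - restart_s)
            events.append((c * cycle + i * slot + jitter, server))
    return events


if __name__ == "__main__":
    for start, server in plan_restarts(["wdqs-a", "wdqs-b", "wdqs-c"], seed=1):
        print(f"{start:8.0f}s  restart {server}")
```

Because a server's slot is fixed, two consecutive restarts of the same server
are at most one cycle plus one slot apart, which is below `max_uptime_s`. The
randomness comes from the shuffled order and the jitter inside each slot.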
We'll continue to work to find the root cause and will inform you of all of
our progress. We will also post our progress here: [1].
Regards,
Zbyszko Papierski
[1] https://phabricator.wikimedia.org/T290330
--
Zbyszko Papierski (He/Him)
Senior Software Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>
Wikidata community members,
Thank you for all of your work helping Wikidata grow and improve over the
years. In the spirit of better communication, we would like to take this
opportunity to share some of the current challenges Wikidata Query Service
(WDQS) is facing, and some strategies we have for dealing with them.
WDQS currently risks failing to provide acceptable service quality due to
the following reasons:
1. Blazegraph scaling
   1. Graph size. WDQS uses Blazegraph as our graph backend. While
      Blazegraph can theoretically support 50 billion edges
      <https://blazegraph.com/>, in reality Wikidata is the largest graph
      we know of running on Blazegraph (~13 billion triples
      <https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=…>),
      and there is a risk that we will reach a size limit
      <https://www.w3.org/wiki/LargeTripleStores#Bigdata.28R.29_.2812.7B.29>
      of what it can realistically support
      <https://phabricator.wikimedia.org/T213210>. Once Blazegraph is maxed
      out, WDQS can no longer be updated. This will also break Wikidata tools
      that rely on WDQS.
   2. Software support. Blazegraph is end-of-life software, which is no
      longer actively maintained, making it an unsustainable backend to
      continue moving forward with long term.
Blazegraph maxing out in size poses the greatest risk for catastrophic
failure, as it would effectively prevent WDQS from being updated further,
and WDQS would inevitably fall out of date. Our long term strategy to
address this is to move to a new graph backend that best meets our WDQS
needs and is actively maintained, and begin the migration off of Blazegraph
as soon as a viable alternative is identified
<https://phabricator.wikimedia.org/T206560>.
In the interim period, we are exploring disaster mitigation options for
reducing Wikidata’s graph size in the case that we hit this upper graph
size limit: (i) identify and delete lower priority data (e.g. labels,
descriptions, aliases, non-normalized values, etc); (ii) separate out
certain subgraphs (such as Lexemes and/or scholarly articles). This would
be a last resort scenario to keep Wikidata and WDQS running with reduced
functionality until we are able to deploy a more long-term solution.
2. Update and access scaling
   1. Throughput. WDQS is currently trying to provide fast updates, and
      fast unlimited queries for all users. As the number of SPARQL queries
      grows over time
      <https://www.mediawiki.org/wiki/User:MPopov_(WMF)/Wikimania_2021_Hackathon>
      alongside graph updates, WDQS is struggling to sufficiently keep up
      <https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=…>
      in each dimension of service quality without compromising anywhere. For
      users, this often leads to timed out queries.
   2. Equitable service. We are currently unable to adjust system behavior
      per user/agent. As such, it is not possible to provide equitable
      service to users: for example, a heavy user could swamp WDQS enough to
      hinder usability by community users.
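To make the idea of adjusting behavior per user/agent more concrete, here is
a minimal Python sketch of one common technique, a token bucket kept
separately for each agent. This is only an illustration under stated
assumptions and does not describe how WDQS works or will work: the class
`PerAgentBudget`, the agent names, and the budget expressed in seconds of
query time are all hypothetical.

```python
class PerAgentBudget:
    """Token bucket per agent: each agent has its own query-time budget.

    Hypothetical illustration only. Tokens are "seconds of query time";
    every agent can hold at most `capacity_s` tokens and regains
    `refill_per_s` tokens per second of wall-clock time.
    """

    def __init__(self, capacity_s=60.0, refill_per_s=1.0):
        self.capacity_s = capacity_s
        self.refill_per_s = refill_per_s
        self._state = {}  # agent -> (tokens, time of last update)

    def allow(self, agent, cost_s, now):
        """Return True and charge the agent if it can afford `cost_s`."""
        # A new agent starts with a full bucket.
        tokens, last = self._state.get(agent, (self.capacity_s, now))
        # Refill for the elapsed time, never exceeding the capacity.
        tokens = min(self.capacity_s, tokens + (now - last) * self.refill_per_s)
        if cost_s > tokens:
            # Not enough budget: record the refill but do not charge.
            self._state[agent] = (tokens, now)
            return False
        self._state[agent] = (tokens - cost_s, now)
        return True


if __name__ == "__main__":
    budget = PerAgentBudget(capacity_s=10.0, refill_per_s=1.0)
    print(budget.allow("heavy-bot", 8.0, now=0.0))        # first query fits
    print(budget.allow("heavy-bot", 8.0, now=1.0))        # budget mostly used up
    print(budget.allow("occasional-user", 8.0, now=1.0))  # separate bucket
```

The point of keeping one bucket per agent is that a heavy user only exhausts
their own budget, so an occasional user arriving at the same moment is not
affected.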
In addition to being a querying service for Wikidata, WDQS is also part of
the edit pipeline of Wikidata (every edit on Wikidata is pushed to WDQS to
update the data there). While deploying the new Flink-based Streaming
Updater <https://phabricator.wikimedia.org/T244590> will help with
increasing throughput of Wikidata updates, there is a substantial risk that
WDQS will be unable to keep up with the combination of increased querying
and updating, resulting in more tradeoffs between update lag and querying
latency/timeouts.
In the near term, we would like to work more closely with you to determine
what acceptable trade-offs would be for preserving WDQS functionality while
we scale up Wikidata querying. In the long term, we will be conducting more
user research to better understand your needs so we can (i) optimize
querying via SPARQL and/or other methods, (ii) explore better user
management that will allow us to prevent heavy use of WDQS that does not
align with the goals of our movement and projects, and (iii) make it easier
for users to set up and run their own query services.
Though this information about the current state of WDQS may not be a total
surprise to many of you, we want to be as transparent with you as possible
to ensure that there are as few surprises as possible in the case of any
potential service disruptions/catastrophic failures, and that we can
accommodate your work as best as we can in the future evolution of WDQS. We
plan on doing a session on WDQS scaling challenges during WikidataCon this
year at the end of October.
Thanks for your understanding with these scaling challenges, and for any
feedback you have already been providing. If you have new concerns,
comments and questions, you can best reach us at this talk page
<https://www.wikidata.org/wiki/Wikidata_talk:Query_Service_scaling_update_Au…>.
Additionally, if you have not had a chance to fill out our survey
<https://docs.google.com/forms/d/e/1FAIpQLSe1H_OXQFDCiGlp0QRwP6-Z2CGCgm96MWB…>
yet, please tell us how you use the Wikidata Query Service (see privacy
statement
<https://foundation.wikimedia.org/wiki/WDQS_User_Survey_2021_Privacy_Stateme…>)!
Whether you are an occasional user or create tools, your feedback is needed
to decide our future development.
Best,
WMF Search + WMDE