[Wikidata] WDQS / WCQS Status update

2 Sep 2020


Hello all!
A quick update on what's going on around our SPARQL endpoints.
* Wikimedia Commons Query Service (WCQS) [1] is available as a beta
service. A number of people have started running queries, and a
number of examples have been added [2]. Thanks all for your help!
* We are focusing again on WDQS and improving the update process [3]. So
far, we have an end-to-end working example for simple updates (revision
create) and are working on adding support for more complex updates
(deletes, undeletes, suppressed deletes, etc.). Once this whole process
is complete and working for WDQS, we'll see how we can adapt it for WCQS
and have streaming updates to WCQS.
* We are looking into the deployment constraints for the new WDQS update
process. Managing Flink at scale is nontrivial: we are just getting started,
and there is a lot more work needed to make this robust.
* We are planning to spend more time doing some analytics on our data [4].
We want to better understand the use cases and the data we have. We are
still defining exactly which questions we want to answer from the data, but
the main ones are:
** What are the most expensive queries, what are they trying to achieve, and
is that reasonable?
** Do we have performant subgraphs that we could expose independently?
This will also require some work to improve our query logging and aggregate
more context with the queries we log.
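To make the "most expensive queries" question a bit more concrete, here is a minimal sketch of the kind of aggregation we have in mind. It is only an illustration: the record format (query text plus execution time in milliseconds), the sample data and the function name are assumptions made up for this example, not our actual query logs or tooling.

```python
from collections import defaultdict

# Hypothetical log records: (SPARQL query text, execution time in ms).
# The real query logs have a different shape; this is only for illustration.
SAMPLE_LOG = [
    ("SELECT ?item WHERE { ?item wdt:P31 wd:Q5 }", 52000),
    ("SELECT ?item  WHERE { ?item wdt:P31 wd:Q146 }", 300),
    ("SELECT ?item WHERE {  ?item wdt:P31 wd:Q5 }", 48000),
]


def rank_by_total_time(records):
    """Group identical queries and rank them by total execution time.

    Returns a list of (query, total_ms, count) tuples, most expensive first.
    """
    totals = defaultdict(lambda: [0, 0])  # query -> [total_ms, count]
    for query, elapsed_ms in records:
        # Collapse whitespace so trivially different copies are grouped.
        key = " ".join(query.split())
        totals[key][0] += elapsed_ms
        totals[key][1] += 1
    return sorted(
        ((query, total, count) for query, (total, count) in totals.items()),
        key=lambda row: row[1],
        reverse=True,
    )


if __name__ == "__main__":
    for query, total_ms, count in rank_by_total_time(SAMPLE_LOG):
        print(f"{total_ms:>8} ms  x{count}  {query}")
```

Ranking by total time rather than by single slowest run is a deliberate choice in this sketch: a moderately slow query that runs very often can cost more overall than a rare, very slow one.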
That's all for today!
Have fun!
Guillaume
[1] https://wcqs-beta.wmflabs.org/
[2]
https://commons.wikimedia.org/wiki/Commons:SPARQL_query_service/queries/exam...
[3] https://phabricator.wikimedia.org/T244590
[4] https://phabricator.wikimedia.org/T257045
-- 
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
