Hello all!
A quick update on what's going on around our SPARQL endpoints.
* Wikimedia Commons Query Service (WCQS) [1] is available as a beta service. A number of people have started running queries, and a number of examples have been added [2]. Thanks all for your help!
* We are focusing again on WDQS and improving the update process [3]. So far, we have an end-to-end working example for simple updates (revision create) and are working on adding support for more complex updates (deletes, undeletes, suppressed deletes, etc.). Once this whole process is complete and working for WDQS, we'll see how we can adapt it to provide streaming updates to WCQS.
* We are looking into the deployment constraints for the new WDQS update process. Managing Flink at scale is non-trivial. We are just getting started, and there is a lot more work to do to make this robust.
* We are planning to spend more time doing some analytics on our data [4]. We want to better understand the use cases and the data we have. We are still defining exactly which questions we want to answer from the data, but the main ones are:
** What are the most expensive queries, what are they trying to achieve, and is that reasonable?
** Do we have performant subgraphs that we could expose independently?
This will also require some work to improve our query logging and aggregate more context with the queries we log.
That's all for today!
Have fun!
Guillaume
--
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET