Hello all!
A quick update on what's going on around our SPARQL endpoints.
* Wikimedia Commons Query Service (WCQS) [1] is available as a beta service. We've seen a number of people starting to run queries, and a number of examples have been added [2]. Thanks all for your help!
* We are focusing again on WDQS and improving the update process [3]. So far, we have an end-to-end working example for simple updates (revision create) and are working on adding support for more complex updates (deletes, undeletes, suppressed deletes, etc.). Once this whole process is complete and working for WDQS, we'll see how we can adapt it to bring streaming updates to WCQS.
* We are looking into the deployment constraints for the new WDQS update process. Managing Flink at scale is non-trivial. We are just getting started, and there is a lot more work to do to make this robust.
* We are planning to spend more time doing some analytics on our data [4]. We want to better understand the use cases and the data we have. We are still defining exactly which questions we want to answer from the data, but the main ones are:
** What are the most expensive queries, what are they trying to achieve, and is that reasonable?
** Do we have performant subgraphs that we could expose independently?
This will also require some work to improve our query logging and aggregate more context with the queries we log.
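As an illustration of the "most expensive queries" question above, here is a minimal sketch of how logged queries could be ranked by total time spent. Everything in it is an assumption made for the example: the QueryLogEntry record, its field names, the whitespace-only normalization, and the sample data are hypothetical and do not reflect the actual WDQS logging format or the analysis the team plans to run.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryLogEntry:
    """One logged query (hypothetical schema, not the real WDQS log format)."""
    query: str
    duration_ms: float


def normalize(query: str) -> str:
    """Collapse whitespace so trivially different copies of a query group together."""
    return " ".join(query.split())


def most_expensive(entries, top_n=3):
    """Return (query, total_ms, count) tuples sorted by total time spent, descending."""
    totals = defaultdict(lambda: [0.0, 0])
    for entry in entries:
        key = normalize(entry.query)
        totals[key][0] += entry.duration_ms
        totals[key][1] += 1
    ranked = sorted(
        ((q, total, count) for q, (total, count) in totals.items()),
        key=lambda row: row[1],
        reverse=True,
    )
    return ranked[:top_n]


# Made-up sample data, for illustration only.
sample_log = [
    QueryLogEntry("SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 10", 120.0),
    QueryLogEntry("SELECT ?item  WHERE { ?item wdt:P31 wd:Q5 }  LIMIT 10", 80.0),
    QueryLogEntry("SELECT * WHERE { ?s ?p ?o }", 60000.0),
]

for query, total_ms, count in most_expensive(sample_log):
    print(f"{total_ms:>10.1f} ms  x{count}  {query}")
```

Ranking by total time rather than per-run time surfaces both single very slow queries and cheap queries that are run very often. Judging what a query is trying to achieve would need the extra logged context mentioned above.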
That's all for today!
Have fun!
Guillaume
[1] https://wcqs-beta.wmflabs.org/
[2] https://commons.wikimedia.org/wiki/Commons:SPARQL_query_service/queries/exam...
[3] https://phabricator.wikimedia.org/T244590
[4] https://phabricator.wikimedia.org/T257045
Hoi,
When will the Commons search be marketed as available in any and all languages?
The WDQS service will mostly benefit people who speak English and use the English data. When other languages are a priority, Special:MediaSearch will get a much bigger audience. It will have more of an impact on Wikipedia articles. It will also make Commons much more usable.
Thanks,
GerardM
On Wed, 2 Sep 2020 at 11:15, Guillaume Lederrey glederrey@wikimedia.org wrote:
[...]
--
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
On Wed, Sep 2, 2020 at 5:41 AM Gerard Meijssen gerard.meijssen@gmail.com wrote:
[...]
Thanks for mentioning MediaSearch [1], Gerard. The development team plans on pushing out some new features later this month that will help demonstrate the benefits you mention. I look forward to sharing more here when we're ready for release.
[1] https://commons.wikimedia.org/wiki/Commons:Structured_data/Media_search