Hello all!
The public WDQS Split Graph endpoints have been available for ~6 months, so it is time to look at what has been happening and at the next steps.
We don’t see strong adoption of the new endpoints (~20 req/min for query-scholarly [1]). However, we’ve identified the source of almost 90% of the current requests that would require migration to the split endpoints. The large majority (~80%) are generated by a tool that is unfinished and has been abandoned by its author. Those queries are already broken or have no value, and will never be migrated. Unsurprisingly, Scholia is a major user of the scholarly subgraph and has not migrated yet.
While we want to move forward, we also want to limit disruption and give more time to the projects that need it. To ease the transition, we’ve created a new endpoint (query-legacy-full.wikidata.org) which contains the full Wikidata graph, but is limited in terms of performance and availability [2]. This new endpoint can be used in place of the current query.wikidata.org by the few projects that need additional migration time. It will be available until December 2025.
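For a tool that needs the extra time, the switch to the legacy endpoint can be as small as a host change. The sketch below is only an illustration: the helper name `to_legacy_endpoint` is hypothetical, and it assumes the legacy host serves the same paths (such as `/sparql`) as query.wikidata.org, which should be checked against [2].

```python
from urllib.parse import urlsplit, urlunsplit

# Host names taken from the announcement above.
CURRENT_HOST = "query.wikidata.org"
LEGACY_HOST = "query-legacy-full.wikidata.org"


def to_legacy_endpoint(url: str) -> str:
    """Point a query.wikidata.org URL at the temporary full-graph endpoint.

    Assumption: the legacy host exposes the same paths as the current one.
    URLs for any other host are returned unchanged. An explicit port or
    userinfo in the original URL is dropped by the host replacement.
    """
    parts = urlsplit(url)
    if parts.hostname != CURRENT_HOST:
        return url
    return urlunsplit(parts._replace(netloc=LEGACY_HOST))


if __name__ == "__main__":
    print(to_legacy_endpoint("https://query.wikidata.org/sparql"))
```

Since the legacy endpoint is announced as temporary and limited in performance and availability, keeping the host in one configurable place like this also makes the later move to the split endpoints easier.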
The next big step is to drop support for the full Wikidata graph on query.wikidata.org [3]. This should happen around April 10. After that step, requests to query.wikidata.org that require the full graph will fail or return invalid results unless they are rewritten to use SPARQL federation [4]. You can ask for help rewriting your queries [5].
In related news, Peter [6] has been exploring the performance of various alternative RDF backends [7]. This is going to be invaluable when we work on replacing Blazegraph!
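To give an idea of what a rewrite to SPARQL federation can look like, here is a rough sketch that assembles a query with a SERVICE clause. It is not the official migration recipe: the function name `build_federated_query` is hypothetical, and the service IRI used as a default is an assumption, so take the exact value to use from the federation guide in [4]. The sketch only builds the query text and does not send anything.

```python
# Assumption: IRI of the scholarly subgraph as a federation target.
# Verify the correct value in the federation guide [4] before relying on it.
SCHOLARLY_SERVICE = "https://query-scholarly.wikidata.org/sparql"


def build_federated_query(select_vars, local_patterns, remote_patterns,
                          service_iri=SCHOLARLY_SERVICE, limit=10):
    """Assemble a SELECT query whose remote patterns sit in a SERVICE block.

    local_patterns are evaluated by the endpoint that receives the query;
    remote_patterns are delegated to service_iri via SPARQL 1.1 federation.
    """
    lines = ["SELECT " + " ".join(f"?{v}" for v in select_vars) + " WHERE {"]
    lines.append(f"  SERVICE <{service_iri}> {{")
    lines.extend(f"    {p}" for p in remote_patterns)
    lines.append("  }")
    lines.extend(f"  {p}" for p in local_patterns)
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative only: articles authored by Q80 and their venues (P50 and
    # P1433, scholarly subgraph), joined with the venue's type (P31, main graph).
    query = build_federated_query(
        select_vars=["article", "venue", "venueType"],
        remote_patterns=["?article wdt:P50 wd:Q80 ;", "         wdt:P1433 ?venue ."],
        local_patterns=["?venue wdt:P31 ?venueType ."],
    )
    print(query)
```

The idea is that the part of the query touching scholarly articles moves into the SERVICE block, while everything else stays as it was. Which triples live in which subgraph depends on the split rules described in [4].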
Have fun!
Guillaume
[1] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&...
[2] https://phabricator.wikimedia.org/T384422
[3] https://phabricator.wikimedia.org/T388134
[4] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split...
[5] https://www.wikidata.org/wiki/Wikidata:Request_a_query
[6] https://www.wikidata.org/wiki/User:Peter_F._Patel-Schneider
[7] https://www.wikidata.org/wiki/Wikidata:Scaling_Wikidata/Benchmarking
It has been pointed out to me that the message wasn't clear enough. To clarify:
* We have identified one tool that generates the majority of queries requiring scholarly articles. This tool is not maintained.
* Scholia is the next largest source of queries requiring scholarly articles. It has not migrated yet, but work is in progress.
-- *Guillaume Lederrey* (he/him) Engineering Manager Wikimedia Foundation https://wikimediafoundation.org/
Any particular reason why Apache Jena (likely the most compatible and up-to-date SPARQL implementation) is not included in the benchmark tests/evaluation by [6]?
Best, Marco
Hi Marco,
I'm not really a dev and I haven't followed Jena's latest developments, but in previous benchmarks (like https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...), Jena was tested and had problems scaling to and holding the full Wikidata graph (which is the most painful point right now). It was also quite slow. Has that changed? If so, it could be worth testing it again.
Cheers, Nicolas
--
Marco Neumann
Wikidata-tech mailing list -- wikidata-tech@lists.wikimedia.org To unsubscribe send an email to wikidata-tech-leave@lists.wikimedia.org