Hi all!
As part of the WDQS Graph Split project,[1] we have new SPARQL endpoints available for serving the “main”[2] and “scholarly”[3] subgraphs of Wikidata.
As you might be aware, we are working on several projects to address the stability and scaling issues of the Wikidata Query Service (WDQS). This announcement is about one of them, the WDQS Graph Split.[1] This change will have an impact on certain uses of the Wikidata Query Service.
We are now entering a transition period until the end of February 2025. The three SPARQL endpoints will remain in place until the end of the transition. At the end of the transition, query.wikidata.org will serve the main Wikidata subgraph (without scholarly articles). The query-main and query-scholarly endpoints will continue to be available after the transition.
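As a rough illustration of the timeline above, the sketch below maps each hostname to the subgraph it serves on a given date. It is only a sketch under stated assumptions: the announcement says "end of February 2025" without naming a day, so `TRANSITION_END` (1 March 2025) is an assumed cutoff, and the function name `subgraph_served` is hypothetical. The "full" value before the cutoff follows the statement elsewhere in this thread that query.wikidata.org keeps serving the full graph during the transition.

```python
from datetime import date

# Assumption: the announcement only says "end of February 2025";
# 1 March 2025 is used here as a hypothetical first post-transition day.
TRANSITION_END = date(2025, 3, 1)


def subgraph_served(host: str, on: date) -> str:
    """Return which subgraph a host serves on a given date, per the announcement."""
    if host == "query-main.wikidata.org":
        return "main"
    if host == "query-scholarly.wikidata.org":
        return "scholarly"
    if host == "query.wikidata.org":
        # Full graph during the transition, main subgraph afterwards.
        return "full" if on < TRANSITION_END else "main"
    raise ValueError(f"unknown endpoint: {host}")
```

For example, `subgraph_served("query.wikidata.org", date(2024, 9, 3))` falls inside the transition period, while any date on or after the assumed cutoff yields the main subgraph.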
If you want to know more about this change, please refer to the talk page on Wikidata.[4]
Have fun!
Guillaume
[1] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split
[2] https://query-main.wikidata.org
[3] https://query-scholarly.wikidata.org
[4] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...
Hello Guillaume -- is the idea no longer for query.wikidata to serve federated queries across -main and -scholarly? Where will federated querying happen? SJ
On Tue, Sep 3, 2024 at 4:01 PM Guillaume Lederrey glederrey@wikimedia.org wrote:
[...]
--
*Guillaume Lederrey* (he/him)
Engineering Manager
Wikimedia Foundation https://wikimediafoundation.org/
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
Hello Samuel!
Thanks for the question! The explanation on the wiki page [1] was probably not clear enough. I'll see if I can reword it in a way that makes more sense. In the meantime, let me try to make things clearer here:
During the transition period, we want to limit the impact on everyone as much as possible, so query.wikidata.org will continue serving the full graph. The two new endpoints serve the "main" and "scholarly" subgraphs. All of these endpoints can be federated together. After the transition period, query.wikidata.org and query-main.wikidata.org will point to the same dataset. Some of this is explained in the Federation Guide [2].
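To make "federated together" a bit more concrete, here is a minimal sketch of a query sent to the main endpoint that reaches into the scholarly endpoint through a standard SPARQL 1.1 `SERVICE` clause. Several things here are assumptions rather than anything stated in this thread: the `/sparql` path on both hosts, the helper names `federated_query` and `run`, the User-Agent string, and the query shape itself (scholarly articles joined to human authors). The Federation Guide may well recommend a different form.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: both hosts expose SPARQL under "/sparql"; the thread only gives the hostnames.
MAIN_ENDPOINT = "https://query-main.wikidata.org/sparql"
SCHOLARLY_ENDPOINT = "https://query-scholarly.wikidata.org/sparql"


def federated_query(limit: int = 5) -> str:
    """Build a query for the main endpoint that pulls articles from the scholarly one."""
    return f"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?article ?author WHERE {{
  SERVICE <{SCHOLARLY_ENDPOINT}> {{
    ?article wdt:P31 wd:Q13442814 ;   # instance of: scholarly article
             wdt:P50 ?author .        # author
  }}
  ?author wdt:P31 wd:Q5 .             # author is a human (expected in the main subgraph)
}}
LIMIT {limit}
""".strip()


def run(query: str, endpoint: str = MAIN_ENDPOINT) -> dict:
    """Send the query over HTTP GET and decode the SPARQL JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    request = Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical client name; replace with something that identifies you.
            "User-Agent": "graph-split-example/0.1",
        },
    )
    with urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `run(federated_query())` would perform the network request; building the query string alone has no side effects, which makes it easy to inspect before sending.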
I hope that helps!
Guillaume
[1] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...
[2] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split...
On Wed, 4 Sept 2024 at 06:30, Samuel Klein meta.sj@gmail.com wrote:
Hello Guillaume -- is the idea no longer for query.wikidata to serve federated queries across -main and -scholarly? Where will federated querying happen? SJ
Dear Guillaume,
thanks for sharing. Could you explain WHO has decided that the graph split will be done?
All the best Moritz
On Thu, Sep 5, 2024 at 6:02 PM Physikerwelt wiki@physikerwelt.de wrote:
Dear Guillaume,
thanks for sharing. Could you explain WHO has decided that the graph split will be done?
All the best Moritz
Hi Physikerwelt,
the problems with the Wikidata Query Service backend have been under discussion since July 2021,[1] and the graph split was introduced as a possibility in October 2023.[2] We communicated about it periodically (maybe not to the best of our abilities, for which I am willing to take the blame), and we have kept communication open with the most affected users the whole time.
Cheers,
[1] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...
[2] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...
Dear Luca,
the communication was good I think.
However, I don't understand the decision process. Who is the responsible person, e.g., the product manager, who eventually decided to cut WDQS into pieces?
Moritz
On Thu, Sep 5, 2024 at 7:45 PM Luca Martinelli [Sannita@WMF] sannita@wikimedia.org wrote:
[...]
On Thu, Sep 5, 2024 at 8:15 PM Physikerwelt wiki@physikerwelt.de wrote:
Dear Luca,
the communication was good I think.
However, I don't understand the decision process. Who is the responsible person, e.g., the product manager, who eventually decided to cut WDQS into pieces?
Moritz
The decision was made jointly by the Search Team at the Wikimedia Foundation and the people in charge of Wikidata at Wikimedia Deutschland. In particular, Lydia Pintscher is ultimately responsible for all Wikidata product decisions at WMDE.
I would like to stress that this decision was not taken lightly. We have known for three years that Blazegraph is on the verge of failing, we wrote a playbook in case of dramatic failure,[1] and we evaluated several alternatives to Blazegraph,[2] each with its pros and cons.
While evaluating which way to go, we knew that no solution would be a "magic wand" that solved all our problems; in fact, no "magic solution" exists, and each option comes with its own load of problems and costs. We came to terms with the fact that we needed more time, and the split was a harsh but ultimately effective way to buy that time for the transition to the next backend.
Hope this helps.
L.
[1] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...
[2] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_upd...
On Fri, Sep 6, 2024 at 4:52 AM Luca Martinelli [Sannita@WMF] < sannita@wikimedia.org> wrote:
no "magic solution" exists, each comes with its load of problems and costs
Given the reload speed, approach to more continuous updating <https://github.com/ad-freiburg/qlever/wiki/QLever-support-for-SPARQL-1.1-Update>, and recent performance <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update/WDQS_backend_alternatives#A_Report_on_Using_QLever> benchmarks <https://github.com/ad-freiburg/qlever/wiki/QLever-performance-evaluation-and-comparison-to-other-SPARQL-engines> from the page you referenced, QLever seems pretty magical. [It was less so when the initial evaluation of backend alternatives came out.] It's also cheap enough to run at home that some people are scratching their own itch now when they have queries that time out on WDQS, as Peter highlights.
Iterating on that benchmark until no one has any concerns with its applicability to our use case seems like a short-term high-return investment. SJ
Looking into alternatives to Blazegraph (QLever being one of them) is definitely on our list of things to do once we have completed our current project of splitting the graph!
On Fri, 6 Sept 2024 at 17:47, Samuel Klein meta.sj@gmail.com wrote:
[...]
Wikidata mailing list -- wikidata@lists.wikimedia.org Public archives at https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/mes... To unsubscribe send an email to wikidata-leave@lists.wikimedia.org
wikidata-tech@lists.wikimedia.org