Hi all,
Please find the WDQS scaling update here https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS-scaling-update-jan-2022 .
We will try to provide monthly updates on the scaling process, starting this month. Please feel free to ask questions or comment where appropriate, and/or stop by the WMF Search Office Hours on the first Wednesday of every month at 16:00 UTC (see Trey’s emails for more details).
Thanks!
Mike
—
*Mike Pham* (he/him) Sr Product Manager, Search Wikimedia Foundation https://wikimediafoundation.org/
Thanks Mike,
I noted the discussion also about "authentication" with WCQS Beta 2 coming Feb. 2. Reading through some of the talk over the past few months (and catching up from holidays)... https://commons.wikimedia.org/wiki/Commons_talk:SPARQL_query_service/Upcomin...
Once a problematic query is executed, it can lock up Blazegraph and cause it to become unresponsive – because we are unable to kill the query, we are forced to restart Blazegraph manually each time, causing user-wide disruptions, and taking a lot of emergency time and energy for the WMF team to resolve (at the expense of other improvements, features, projects).
Is it still the case that the team is unable to kill a problematic query?
Thad https://www.linkedin.com/in/thadguidry/ https://calendly.com/thadguidry/
Hi Thad,
Thanks for the question. I’ll address the question to the best of my ability, but a more technically savvy person from my team can correct me if I get the details wrong.
> Is it still the case that the team is unable to kill a problematic query?
Problematic queries can lock up Blazegraph, causing it to become unresponsive. In this case, the only solution is to restart Blazegraph, as we are unable to ask it to kill the problematic query. Unfortunately, restarting Blazegraph does not guarantee that the user agent will not simply resend that same query, causing the cycle to continue. This is one of the contexts in which we talk about authentication providing our team with more tools to handle problematic queries that bring down the query service.
> I noted the discussion also about “authentication” with WCQS Beta 2 coming Feb. 2.
Small correction (it might have been a typo): WCQS Beta 2 is scheduled to go live on Tuesday, Feb 1.
Hope that helps answer your question!
Best, Mike
—
*Mike Pham* (he/him) Sr Product Manager, Search Wikimedia Foundation https://wikimediafoundation.org/
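To make the resend cycle described above concrete, here is a minimal Python sketch of why knowing the sender helps. It is purely illustrative and is not how WDQS or Blazegraph work: the `QueryGate` class, its method names, and the idea of keying a blocklist on a client identifier plus a query hash are all assumptions made for this example.

```python
import hashlib
from typing import Optional


def fingerprint(query: str) -> str:
    """Hash a query after collapsing whitespace, so trivial reformatting still matches."""
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class QueryGate:
    """Hypothetical front-end check that runs before a query reaches the backend."""

    def __init__(self) -> None:
        # Set of (client_id, query fingerprint) pairs flagged after an incident.
        self._blocked = set()

    def flag(self, client_id: str, query: str) -> None:
        """Record that this client's query preceded a forced restart."""
        self._blocked.add((client_id, fingerprint(query)))

    def allow(self, client_id: Optional[str], query: str) -> bool:
        """Return False only when an identified client resends a flagged query."""
        if client_id is None:
            # Anonymous traffic cannot be matched to an earlier incident,
            # so the same query gets through again after a restart.
            return True
        return (client_id, fingerprint(query)) not in self._blocked
```

In this sketch an anonymous sender always passes the check, which mirrors the cycle described above; only an identified sender can be stopped (or contacted) before the same query is run again.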
_______________________________________________ Wikidata mailing list -- wikidata@lists.wikimedia.org To unsubscribe send an email to wikidata-leave@lists.wikimedia.org
Yes, it does. I'll add my comment to the discussion, then:
"As the Search team has looked into this already and stated they see no alternative to knowing which users are issuing problematic queries that can bring down the service for all, I think that, to ensure minimum availability to all users, authentication should be required. Without authentication, the team says that service degradation can still occur, but we would not have a clear signal of who might be causing a problem for all, which we would need in order to help correct the behavior or address that user's needs. This is no different from other API service providers with public endpoints requiring an API user key or token for identification, in case they need to contact that user."
Thad https://www.linkedin.com/in/thadguidry/ https://calendly.com/thadguidry/
Thanks Mike! A plan shared at WDC was to migrate off of Blazegraph "as soon as a viable alternative is identified". Is that still the plan? And how does this affect scaling work?
On Fri, Jan 28, 2022 at 3:55 PM Thad Guidry thadguidry@gmail.com wrote:
> Yes, it does. I'll add my comment to the discussion, then:
> "As the Search team has looked into this already and stated they see no alternative to knowing which users are issuing problematic queries that can bring down the service for all, I think that, to ensure minimum availability to all users, authentication should be required."
As came up in the live discussion about this (I don't recall if notes were published somewhere, or I'd link them) -- there are other ways to guess which Query Service user or session issued a problematic query without requiring full authentication; and there is no evidence that problematic query generators would try to get around simpler ways of identifying their session (or try to ignore clear feedback that their query was harmful).
'Requiring all users to auth' has known immediate downsides, and still won't prevent a user from reissuing a problematic query, until some process for feedback + warning is implemented. So it doesn't seem like an obvious next step even if it turns out to be important in the end //
SJ
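As one possible illustration of what lighter-weight identification short of full authentication might look like, the Python sketch below builds a SPARQL request that names its sender in the User-Agent header. It only constructs the request and does not send it. The endpoint URL, the tool name, and the contact address are assumptions chosen for this example; none of them come from this thread, and this is not a description of what the Search team has proposed.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for illustration; the thread itself does not name one.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query: str, tool: str, contact: str) -> Request:
    """Build (but do not send) a GET request that identifies its sender."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    headers = {
        # Hypothetical tool name and contact, so operators can reach the sender.
        "User-Agent": f"{tool} ({contact})",
        # Standard media type for SPARQL JSON results.
        "Accept": "application/sparql-results+json",
    }
    return Request(url, headers=headers)


req = build_request(
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
    "ExampleBot/0.1",
    "ops@example.org",
)
```

Sending the request would be a separate step, for example `urllib.request.urlopen(req, timeout=60)`, where the client-side timeout is again only an illustrative value. A header like this is self-declared, so it gives a hint about the sender rather than proof.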
Hi Samuel,
Thanks for the question (and attending our WDC session!).
Migrating off of Blazegraph is still our top priority for scaling WDQS. Our main goal for the first half of this calendar year is to identify a viable alternative to Blazegraph and to answer some questions about what a technical scaling plan looks like.
As mentioned in the Jan scaling update, Andrea Westerinen https://wikitech.wikimedia.org/wiki/User:AndreaWest has just joined our team as a Contract Graph Consultant, and will be helping us with identifying a Blazegraph replacement. Please visit her (sub)page(s) for some more good information, questions and discussions on the process!
Mike
—
*Mike Pham* (he/him) Sr Product Manager, Search Wikimedia Foundation https://wikimediafoundation.org/
That all sounds exciting & promising. Andrea, thanks for sharing in such detail on your page! I hope you'll update this list as you reach new stages of that effort.
Warmly, SJ