Re: [Wikidata-tech] Thoughts on (not) exposing a SPARQL endpoint

11 Mar 2015

On 11.03.2015 00:44, Magnus Manske wrote:
...
 To be fair, the discussion is not "what will we do till the end of
 time", rather "what do we start with".

 Knowing neither SPARQL nor the data storage engine terribly well, it
 would not be helpful if the service can be DOSed by innocent-looking
 queries, intentional or not. Exposing only a subset of SPARQL (in this
 case, via WDQ wrapper) initially would be a way to test the waters. A
 proper SPARQL API can be exposed at any time later, once we're confident
 it will hold up.

 This seems more like a technical decision in terms of "operational
 security", rather than a philosophical one about the merits of query
 languages (where SPARQL is undoubtedly more powerful than WDQ).

Sure, but my point is that there is zero evidence right now that such a 
WDQ wrapper would be more robust against intentional DOS. As I explained 
in my email, such a wrapper would still use a significant number of 
SPARQL features in the backend. I am sure there will be cases when the new 
service will go down (we have seen it happening to WDQ and, more 
generally, to Wikipedia, in the past). What I don't see is how the use 
of a WDQ API on top of SPARQL would make the overall setup any less 
vulnerable; it mainly introduces an additional component on top of 
SPARQL, and we can have a simpler SPARQL-based filter component there if 
we want, which is likely to be more effective in controlling usage. The 
only thing that could really lead to a more robust setup would be the 
use of a more robust backend engine, and I don't see what this should be.
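
To make the idea of a SPARQL-based filter component a bit more concrete, 
here is a minimal sketch in Python. Everything in it is an assumption 
made for illustration: the function name check_query, the length limit, 
and the list of blocked keywords are hypothetical and not taken from any 
existing component. It only does crude text checks; a real filter would 
more likely parse the query and be combined with backend-side timeouts.

```python
import re

# Hypothetical policy values, chosen only for illustration.
MAX_QUERY_CHARS = 2000
BLOCKED_KEYWORDS = ("SERVICE", "INSERT", "DELETE", "LOAD", "DROP", "CLEAR", "CREATE")


def check_query(query: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a SPARQL query string.

    This is a crude textual pre-filter, not a SPARQL parser.
    """
    if len(query) > MAX_QUERY_CHARS:
        return False, "query too long"

    # Blank out string literals and <IRIs> so that keywords appearing
    # inside them are not mistaken for query syntax.
    stripped = re.sub(r'"(?:[^"\\]|\\.)*"|<[^<>\s]*>', " ", query)

    for kw in BLOCKED_KEYWORDS:
        # The lookbehind skips variables (?load, $load) and prefixed
        # names (ex:load) that merely contain a blocked word.
        if re.search(rf"(?<![?$:\w]){kw}\b", stripped, flags=re.IGNORECASE):
            return False, f"keyword not allowed: {kw}"

    return True, "ok"
```

Such a check would sit in front of the SPARQL endpoint and reject a 
query before it reaches the backend, for example 
check_query('SELECT * WHERE { SERVICE <http://x/> { ?s ?p ?o } }') 
is refused because of the SERVICE keyword.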

The discussion here is not about which query language we should use. 
What Daniel proposes is to give up on supporting a standard query 
language and to restrict ourselves to a special-purpose API. This is a big deal. 
If we really want a special-purpose query language for ourselves, we 
would need to have a discussion about it. WDQ is a useful baseline, but 
it is the result of an evolution of ideas and features over time. One 
would probably come up with a few different decisions when seeing the 
whole picture from the start. There is a huge cost to designing a query 
API from scratch, and I would really like to avoid this. Supporting WDQ 
on top of SPARQL would retain WDQ in its current form and still support 
standards -- if we want to develop an official custom API, we will give 
up on both of these benefits, and at the same time push the ETA for 
Wikidata queries far into the future.

All of this has been discussed and considered in the past. I don't see 
why one would be kicking off discussions now that question everything 
decided in meetings and telcos over the past weeks. There is absolutely 
no new information compared to what has led to the consensus that we all 
(including Daniel) had reached.

Regards,

Markus
