On 11.03.2015 00:44, Magnus Manske wrote:
> To be fair, the discussion is not "what will we do till the end of
> time", rather "what do we start with".
>
> Knowing neither SPARQL nor the data storage engine terribly well, it
> would not be helpful if the service can be DOSed by innocent-looking
> queries, intentional or not. Exposing only a subset of SPARQL (in this
> case, via WDQ wrapper) initially would be a way to test the waters. A
> proper SPARQL API can be exposed at any time later, once we're confident
> it will hold up.
>
> This seems more like a technical decision in terms of "operational
> security", rather than a philosophical one about the merits of query
> languages (where SPARQL is undoubtedly more powerful than WDQ).

Sure, but my point is that there is zero evidence right now that such a
WDQ wrapper would be more robust against intentional DOS. As I explained
in my email, such a wrapper would still use a significant number of
SPARQL features in the backend. I am sure there will be cases when the new
service will go down (we have seen it happening to WDQ and, more
generally, to Wikipedia, in the past). What I don't see is how the use
of a WDQ API on top of SPARQL would make the overall setup any less
vulnerable; it mainly introduces an additional component on top of
SPARQL, and we can have a simpler SPARQL-based filter component there if
we want, which is likely to be more effective in controlling usage. The
only thing that could really lead to a more robust setup would be the
use of a more robust backend engine, and I don't see what this should be.
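
To make the idea of a simple SPARQL-based filter component a bit more
concrete, here is a rough Python sketch of what such a check in front of
the endpoint might look like. Everything in it is an assumption for
illustration only: the function name, the length limit, and the list of
blocked features are hypothetical, and the checks are plain regular
expressions rather than a real SPARQL parser (so keywords inside string
literals would be rejected as well). It would not replace query timeouts
in the backend.

```python
import re

# Hypothetical limit, chosen only for illustration.
MAX_QUERY_LENGTH = 2000

# Hypothetical list of SPARQL features a first deployment might refuse.
BLOCKED_FEATURES = {
    "federated SERVICE call": re.compile(r"\bSERVICE\b", re.IGNORECASE),
    "update operation": re.compile(
        r"\b(?:INSERT|DELETE|LOAD|CLEAR|DROP|CREATE)\b", re.IGNORECASE
    ),
}


def check_query(query: str) -> list[str]:
    """Return the reasons for rejecting a query (empty list = accept)."""
    problems = []
    if len(query) > MAX_QUERY_LENGTH:
        problems.append("query too long")
    for name, pattern in BLOCKED_FEATURES.items():
        if pattern.search(query):
            problems.append(name)
    return problems
```

A query would only be forwarded to the SPARQL backend if check_query()
returns an empty list; the list of blocked features could then be
relaxed step by step as we gain confidence in the setup.
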
The discussion here is not about which query language we should use.
What Daniel proposes is to give up on supporting a standard query
language and to restrict ourselves to a special-purpose API. This is a
big deal.
If we really want a special-purpose query language for ourselves, we
would need to have a discussion about it. WDQ is a useful baseline, but
it is the result of an evolution of ideas and features over time. One
would probably make a few different decisions when seeing the
whole picture from the start. There is a huge cost to designing a query
API from scratch, and I would really like to avoid this. Supporting WDQ
on top of SPARQL would retain WDQ in its current form and still support
standards -- if we want to develop an official custom API, we will give
up on both of these benefits, and at the same time push the ETA for
Wikidata queries far into the future.
All of this has been discussed and considered in the past. I don't see
why one would be kicking off discussions now that question everything
decided in meetings and telcos over the past weeks. There is absolutely
no new information compared to what has led to the consensus that we all
(including Daniel) had reached.
Regards,
Markus