* Either way, there will be a WDQ-like wrapper around SPARQL, maybe as the
official interface, maybe only at the current WDQ URL (and I'll have to
read up on SPARQL to write that, so if someone else writes it for me, all
* WDQ syntax is very limited (no references, no variables, etc.), but it
covers a large number of use cases at this point in time
* A WDQ wrapper could add some sought-after functionality quite easily
(regular expression label matching comes to mind), but it is probably not a
long-term solution, given its limitations
* A WDQ-syntax interface would be a great proof-of-concept that the new
solution can, at the very least, do what the current one does
On Tue, Mar 10, 2015 at 4:06 PM Daniel Kinzler <daniel.kinzler(a)wikimedia.de>
On 10.03.2015 at 16:47, Markus Krötzsch wrote:
I can understand your thoughts to some extent, but they seem to apply to
potential solution. Committing to a primary query interface will always
well, a commitment. Because of this, I think the big advantage of
exactly that it is a technology standard that does not depend on a
tool. If you want to minimize lock-in and be maximally future-safe, this
to be a good thing.
Committing to the broadest possible interface, even if it's a standard,
problem I see, because it makes swapping out the backend close to
propose committing to an interface that is as narrow as it can be for our
case. That's general best practice in system design, I believe.
Note that we are not only committing to a (standardized, but very complex)
language, but also to our data mapping. WDQ would abstract from that, and
us wiggle room to adjust the mapping later.
I would certainly not support the use of a
tool-specific query language that is not specified anywhere but in
Of course the language would need to be well specified, and modified in
We'd want a production grammar, and a decent parser (recursive descent,
WDQ is great but it is a custom API of a single
than a query language.
It would be our Domain Specific Language. There's a lot to be said for
they are well documented.
* "WDQ would go away": That's not a worry I have at all. It will be easy
write an adaptor for WDQ to SPARQL and to keep up the service as it is
That is exactly what I'm proposing. I'd just say that the WDQ version
canonical one, while the SPARQL one would be considered raw/unstable, like
SQL databases on labs.
* "SPARQL would be too expressive, or could have non-standard extensions
are hard to support in the future": This can be addressed in two ways.
way is to document clearly which features are supported, and to maintain
backwards compatibility only wrt these.
This documentation is unlikely to be complete, and people will use what
"works now", and complain when it breaks. They *will* use vendor specific
features and optimizations, even if you tell them they shouldn't. And
be trouble when they break.
The hard (as in firm, not as in difficult) way is to restrict queries to
use only such a limited set of features. This is easy to do, since SPARQL query
already part of any DBMS that supports such queries, and it would be
hook into this process to restrict queries without any notable
overhead. This would minimize vendor lock-in, since one would only
commit to (a subset of) the fully standardized features.
That is the plan for sandboxing SPARQL. It's doable, but not easy.
"safe" WDQ on top of SPARQL is going to be simpler and quicker, I think.
give us a public query interface *faster*.
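
To make the "restrict queries to a limited feature set" idea concrete, here is a minimal Python sketch of an allowlist check. Everything in it is an assumption for illustration: the allowed keyword set is hypothetical (the thread does not decide which features to support), and the check works on tokens rather than hooking into a DBMS's own query parser as described above, so it is much weaker than a real sandbox.

```python
import re

# Hypothetical allowlist; which features to support is a policy decision
# that this thread leaves open.
ALLOWED = {"PREFIX", "SELECT", "DISTINCT", "WHERE", "FILTER",
           "OPTIONAL", "LIMIT", "OFFSET", "ORDER", "BY"}

# A subset of SPARQL 1.1 keywords that this sketch knows how to reject.
KNOWN = ALLOWED | {"SERVICE", "INSERT", "DELETE", "LOAD", "CLEAR", "DROP",
                   "CREATE", "CONSTRUCT", "DESCRIBE", "ASK", "UNION",
                   "MINUS", "GRAPH", "BIND", "VALUES", "GROUP", "HAVING"}

IRI_RE = re.compile(r'<[^<>"{}|^`\\\s]*>')
STRING_RE = re.compile(r'"(?:[^"\\\n]|\\.)*"|\'(?:[^\'\\\n]|\\.)*\'')
COMMENT_RE = re.compile(r"#[^\n]*")
# Bare words only: not variables (?x, $x) and not parts of prefixed names (wdt:P31).
WORD_RE = re.compile(r"(?<![?$:\w])[A-Za-z]+(?![\w:])")


def disallowed_keywords(query):
    """Return the known SPARQL keywords in `query` that are not on the allowlist."""
    text = query
    # Blank out IRIs, string literals and comments so their contents are ignored.
    for pattern in (IRI_RE, STRING_RE, COMMENT_RE):
        text = pattern.sub(" ", text)
    words = {word.upper() for word in WORD_RE.findall(text)}
    return (words & KNOWN) - ALLOWED


def is_allowed(query):
    """True if the query uses no known keyword outside the allowlist."""
    return not disallowed_keywords(query)
```

For example, `is_allowed("SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 10")` passes, while a query containing a `SERVICE` clause is rejected. A production version would inspect the parsed query tree instead of tokens, which is where the "doable, but not easy" part comes in.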
With both of these in place, your concerns should be addressed without
to build our own query language from scratch
optimizers, user documentation, ...).
With WDQ on top of SPARQL, we need a parser and a SPARQL emitter, that's it.
Documentation is already there (well, to a degree), and optimization is
by the SPARQL endpoint.
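
As a rough illustration of the "parser plus SPARQL emitter" idea, here is a minimal Python sketch that handles only WDQ `CLAIM[property:item]` terms joined by `AND`. The function names are made up for this example, and the `wd:`/`wdt:` prefixes are an assumed data mapping (the thread explicitly leaves the mapping open); a real wrapper would need a proper grammar and the rest of the WDQ commands.

```python
import re

# Assumed RDF mapping; the thread does not fix one.
PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
)

CLAIM_RE = re.compile(r"CLAIM\[(\d+):(\d+)\]")


def parse_wdq(query):
    """Parse 'CLAIM[p:q] AND CLAIM[p:q] ...' into a list of (property, item) pairs."""
    claims = []
    for part in re.split(r"\s+AND\s+", query.strip()):
        match = CLAIM_RE.fullmatch(part.strip())
        if match is None:
            # Anything beyond simple CLAIM terms is out of scope for this sketch.
            raise ValueError(f"unsupported WDQ fragment: {part!r}")
        claims.append((int(match.group(1)), int(match.group(2))))
    return claims


def emit_sparql(claims):
    """Emit one triple pattern per claim, all constraining the same ?item."""
    patterns = "\n".join(f"  ?item wdt:P{p} wd:Q{q} ." for p, q in claims)
    return f"{PREFIXES}SELECT ?item WHERE {{\n{patterns}\n}}"


def wdq_to_sparql(query):
    """Translate a (very restricted) WDQ query string into a SPARQL SELECT."""
    return emit_sparql(parse_wdq(query))
```

Calling `wdq_to_sparql("CLAIM[31:5] AND CLAIM[27:183]")` yields a SELECT with two triple patterns on `?item`. Because only the emitter knows about the prefixes, the mapping could change later without touching the WDQ-facing side, which is the wiggle room argued for above.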
Moreover, both of these can be added at any stage of the project, so we are
not blocked now by having to decide these details. Right now the main
priority should be to get something
rather than to go back to the drawing board.
Yes, absolutely, but what we make available publicly
1) has to be safe - I believe this is easier and faster to do with WDQ.
2) should be future proof - again, easier with WDQ, because it's more
restrictive and domain specific. It allows us to change the underlying
or technology. SPARQL doesn't easily.
In any case, I'm not saying we shouldn't make a SPARQL endpoint available
all. I'm saying it should not be the canonical query interface, but rather
"raw" query interface. That would give us a lot more headroom to change
later, without breaking a lot of 3rd party code.
Senior Software Developer
Gesellschaft zur Förderung Freien Wissens e.V.
Wikidata-tech mailing list