After the initial enthusiasm, I have grown increasingly wary of the prospect of
exposing a SPARQL endpoint as Wikidata's canonical query interface. I decided to
share my (personal and unfinished) thoughts about this on this list, as food for
thought and a basis for discussion.
Basically, I fear that exposing SPARQL will lock us in with respect to the
backend technology we use. Once it's there, people will rely on it, and taking
it away would be very harsh. That would make it practically impossible to move
to, say, Neo4j in the future. This is even more true if we expose
vendor-specific extensions like RDR/SPARQL*.
Also, exposing SPARQL as our primary query interface probably means abruptly
discontinuing support for WDQ. It's pretty clear that the original WDQ service
is not going to be maintained once the WMF offers infrastructure for Wikidata
queries. So, when SPARQL appears, WDQ would go away, and dozens of tools would
need major modifications, or would just die.
So, my proposal is to expose a WDQ-like service as our primary query interface.
This follows the general principle of having narrow interfaces to make it easy to
swap out the implementation.
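
To illustrate the narrow-interface principle, here is a minimal Python sketch.
It is not a proposal for the actual API: the class names, the single method
items_with_claim, and the toy data are all hypothetical. The point is only that
a client written against a small interface does not notice when the backend
behind it is replaced.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Set, Tuple

Claim = Tuple[int, int, int]  # (item ID, property ID, value item ID), toy model


class QueryBackend(ABC):
    """Hypothetical narrow interface: the only thing clients may rely on."""

    @abstractmethod
    def items_with_claim(self, prop: int, value: int) -> Set[int]:
        """Return the IDs of all items that have a claim prop -> value."""


class ScanBackend(QueryBackend):
    """Naive implementation: scan every claim on each query."""

    def __init__(self, claims: Iterable[Claim]) -> None:
        self._claims = list(claims)

    def items_with_claim(self, prop: int, value: int) -> Set[int]:
        return {i for (i, p, v) in self._claims if p == prop and v == value}


class IndexedBackend(QueryBackend):
    """Different implementation, same interface: a prebuilt lookup index."""

    def __init__(self, claims: Iterable[Claim]) -> None:
        self._index = {}
        for item, prop, value in claims:
            self._index.setdefault((prop, value), set()).add(item)

    def items_with_claim(self, prop: int, value: int) -> Set[int]:
        return set(self._index.get((prop, value), set()))


def humans(backend: QueryBackend) -> Set[int]:
    """Client code: depends on the interface only, never on the backend."""
    return backend.items_with_claim(31, 5)


# Toy data for illustration only.
SAMPLE_CLAIMS = [(42, 31, 5), (64, 31, 515), (80, 31, 5)]
```

Here humans(ScanBackend(SAMPLE_CLAIMS)) and humans(IndexedBackend(SAMPLE_CLAIMS))
return the same set, so the implementation can be swapped without touching the
client. An interface as wide as full SPARQL offers no such freedom.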
But the power of SPARQL should not be lost: A (sandboxed) SPARQL endpoint could
be exposed to Labs, just like we provide access to replicated SQL databases
there: on Labs, you get "raw" access, with added performance and flexibility,
but no guarantees about interface stability.
In terms of development resources and timeline, exposing WDQ may actually get us
a public query endpoint more quickly: sandboxing full SPARQL may well turn out
to be a lot harder than sandboxing the more limited set of queries WDQ allows.
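
To make the sandboxing argument concrete, here is a rough Python sketch of
validating a deliberately tiny, WDQ-like query language. It is an assumption-
laden toy, not the real WDQ grammar: the operator whitelist, the clause limit,
and the function name validate are all made up for illustration. With a small
grammar, everything that is not explicitly allowed can simply be rejected,
which is much harder to do for an expressive language like SPARQL.

```python
import re

# Hypothetical whitelist and limit; the real WDQ operator set is larger.
ALLOWED_OPERATORS = {"CLAIM", "NOCLAIM"}
MAX_CLAUSES = 5

# One clause looks like OPERATOR[property] or OPERATOR[property:item].
_CLAUSE = re.compile(r"([A-Z]+)\[(\d+)(?::(\d+))?\]")


def validate(query):
    """Parse a query of AND-joined clauses; raise ValueError if not allowed."""
    parts = re.split(r"\s+AND\s+", query.strip())
    if len(parts) > MAX_CLAUSES:
        raise ValueError("too many clauses")
    parsed = []
    for part in parts:
        match = _CLAUSE.fullmatch(part)
        if match is None:
            raise ValueError("unparseable clause: %r" % part)
        operator, prop, value = match.group(1), match.group(2), match.group(3)
        if operator not in ALLOWED_OPERATORS:
            raise ValueError("operator not allowed: %s" % operator)
        parsed.append((operator, int(prop), int(value) if value else None))
    return parsed
```

For example, validate("CLAIM[31:5] AND NOCLAIM[21]") returns a list of parsed
clauses, while anything outside the toy grammar, including a SPARQL string,
raises ValueError before it ever reaches the backend.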
Finally, why WDQ and not something else, say, MQL? Because WDQ is specifically
tailored to our domain and use case, and there already is an ecosystem of tools
that use it. We'd want to refine it a bit, I suppose, but by and large, it's
pretty much exactly what we need, because it was built around the actual demand
for querying Wikidata.
Those are my thoughts so far. Note that this is not a decision or recommendation
by the Wikidata team, just my personal take.
Senior Software Developer
Gesellschaft zur Förderung Freien Wissens e.V.