The choice for SPARQL was not made by me or by anyone who has a special
interest in pushing this particular formalism (in fact Nik and Stas can
confirm that I have been quite sceptical about the feasibility of using
BlazeGraph at first). It was the result of an open-minded discussion.

We all had some skepticism, not specifically about BlazeGraph but about
RDF in general as an underlying data model, due to the significant
complexity of Wikidata's own data, which requires some work to fit into
the triples model. After constructing the big spreadsheet, analyzing all
the options and thinking a bit more about the data model and its usage,
we changed our opinion and decided that the problems we face are
solvable and that solving them would be the way to go.

BlazeGraph specifically emerged as the best available solution due to
its combination of features, extensibility, openness and the support
provided by their team. The fact that we are building on existing
technology (RDF/SPARQL) with developed practices was a factor, but not
the only one.

We cannot claim to know with absolute certainty the single best way to
proceed. We can, however, make an honest effort to evaluate all available
options and choose the one that we perceive to be the best at the
moment. That's what we did. Of course, as we gain more experience and as
environments change, we may add another option or even arrive at the
conclusion that we were mistaken. There's no guarantee against that. But
for now we're proceeding with what we consider the best option.

As for WDQ, since it is a simple language, it is probably not hard to
translate to SPARQL. I'm not sure the result would be good SPARQL, but I
hope the query optimizer would take care of that (yeah, I know it's not
magic, but we'll see). I'll try to put something together pretty soon and
see how it behaves. Some of the WDQ features, such as wide branching with
"OR" options, may be quite inefficient in SPARQL, but we can generate it
anyway. I'll update when I have something interesting (probably next week).
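
To make the "OR" concern a bit more concrete, here is a rough sketch of how a set of alternative claims could be turned into a SPARQL query with one UNION branch per alternative. This is only an illustration, not an actual translator: the function names, the (property, item) input shape and the wd:/wdt: prefixes are all assumptions made for the example, and real WDQ parsing is not attempted.

```python
# Hypothetical sketch: turn an OR of WDQ-style claims into SPARQL.
# Assumptions (not from the original message): the wd:/wdt: prefixes,
# the (property, item) tuple input, and all function names.

PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
)


def claim_pattern(prop, item):
    """Build one triple pattern for 'item has property P<prop> with value Q<item>'."""
    return f"?item wdt:P{prop} wd:Q{item} ."


def or_claims_to_sparql(claims):
    """Emit a SELECT query with one UNION branch per alternative claim."""
    if not claims:
        raise ValueError("at least one claim is required")
    branches = ["{ " + claim_pattern(p, q) + " }" for p, q in claims]
    body = "\n  UNION\n  ".join(branches)
    return f"{PREFIXES}SELECT DISTINCT ?item WHERE {{\n  {body}\n}}"


if __name__ == "__main__":
    # Two alternatives: P31 = Q5 or P31 = Q6.
    print(or_claims_to_sparql([(31, 5), (31, 6)]))
```

Each extra alternative adds another UNION branch, so a wide "OR" grows the query linearly; whether the engine evaluates that efficiently is exactly the open question above.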