Hi Bene,
On 29.05.2015 22:06, Bene* wrote:
> Hi Markus,
>> Maybe it could be more efficient to do some API requests to find the right entities rather than filtering many alternative labels as part of the query. It's not a pattern that we should encourage for production ;-)
> I don't think it would be more efficient to do any API requests before querying BlazeGraph, because the API is way slower than the triple search in BlazeGraph. For exposing SPARQL to newcomers it would perhaps be nicer to add the actual Q/P ids instead of huge lists of UNIONs, but for the internal query I guess passing everything to BlazeGraph and letting it do the right thing with it is IMO better and more efficient.
Two reasons:
* Performance: It depends. The API might be slower in simple cases, but every label that is part of the SPARQL query adds a join. At some point, the query will need too much memory to run at all, and fast turns into impossible ;-). I think some of the queries the system creates are already too big for BlazeGraph. If the API really is that slow, one could use BlazeGraph like the API (issuing many small queries to fetch IDs). But in the end, resolving the entities in a query needs only very few API requests, and they are needed only once, when building the query.
* Utility: This is the more important reason. Resolving the Qids and Pids as part of the SPARQL generation process will make the tool more useful. You can print something like "I assume that by 'Madonna' you meant the American singer, songwriter, and actress (Q1744)" and let the user change this. As it is now, the query for the husbands of Madonna gives a rather surprising mix of results for different entities. This might still be entertaining for Madonna, but it is a real problem for questions like "Where is Paris?", which do not produce a meaningful result (you get a list of places and coordinates for different things, and you cannot find out which coordinate belongs to which place).
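To make the "resolve first, then query" idea concrete, here is a rough sketch in Python. It is only an illustration under assumptions, not how the tool works today: it assumes the public Wikidata API endpoint and its wbsearchentities action, the wd:/wdt: prefixes of the Wikidata RDF export, and P26 as the spouse property. All function names are made up for the example.

```python
from urllib.parse import urlencode

# Assumption: the tool talks to the public Wikidata API.
API_URL = "https://www.wikidata.org/w/api.php"


def search_url(label, language="en", entity_type="item", limit=5):
    """Build a wbsearchentities request URL for one label (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": entity_type,  # "item" for Qids, "property" for Pids
        "limit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def candidates(response):
    """Extract (id, description) pairs from a decoded wbsearchentities response."""
    return [
        (hit["id"], hit.get("description", ""))
        for hit in response.get("search", [])
    ]


def describe_choice(label, candidate):
    """Message that tells the user which entity was picked, so they can change it."""
    entity_id, description = candidate
    return f"I assume that by '{label}' you meant {description} ({entity_id})"


def build_query(subject_id, property_id):
    """SPARQL with resolved ids: a single triple pattern and no label joins."""
    return "\n".join([
        "PREFIX wd: <http://www.wikidata.org/entity/>",
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>",
        "SELECT ?value WHERE {",
        f"  wd:{subject_id} wdt:{property_id} ?value .",
        "}",
    ])
```

One request per label is needed while building the query; the top candidate (or the one the user picks) then goes into build_query, e.g. build_query("Q1744", "P26") for the spouses of Madonna.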
A label-based query paradigm can also work, but then it should produce queries that return the entities that were found for each label, so the user can at least see from the result which "Madonna" each husband belongs to.
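A minimal sketch of what I mean by returning the matched entities, again in Python. The assumptions here are mine: labels are stored as rdfs:label with a language tag, the wdt: prefix is the one from the Wikidata RDF export, and the helper name is invented for the example. Alternative labels would need an additional pattern, which I left out.

```python
def label_query(label, property_id, language="en"):
    """Label-based SPARQL that also selects the entity each result belongs to."""
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    literal = f'"{escaped}"@{language}'
    return "\n".join([
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>",
        "SELECT ?entity ?value WHERE {",
        f"  ?entity rdfs:label {literal} .",
        f"  ?entity wdt:{property_id} ?value .",
        "}",
    ])
```

With label_query("Paris", "P625") (P625 being the coordinate location property), every coordinate in the result comes with the ?entity it was found for, so the user can tell the different places called "Paris" apart.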
Cheers,
Markus