Re: [Wikidata] weekly summary #159

29 May 2015


Hi Bene,
On 29.05.2015 22:06, Bene* wrote:
...
Hi Markus,
...
Maybe it could be more efficient to do some API requests to find the
right entities rather than filtering many alternative labels as part
of the query. It's not a pattern that we should encourage for
production ;-)
I don't think it would be more efficient to do any API requests before
querying BlazeGraph, because the API is way slower than the triple search
in BlazeGraph. For exposing SPARQL to newcomers it would perhaps be
nicer to add the actual Q/P ids instead of huge lists of UNIONs, but for
the internal query I guess passing everything to BlazeGraph and letting it
do the right thing is, imo, better and more efficient.
Two reasons:
* Performance: It depends. The API might be slower in simple cases, but
every label that is part of the SPARQL query adds a join. At some
point, the query will just need too much memory to run at all, and fast 
turns into impossible ;-). I think some of the queries the system 
creates are already too big for BlazeGraph. If the API is really so 
slow, one could use BlazeGraph like the API (issuing many small queries 
to fetch IDs). But in the end resolving the entities in a query needs 
only very few API requests, and they are only needed once when building 
the query.
* Utility: This is the more important reason. Resolving the Qids and 
Pids as part of the SPARQL generation process will make the tool more 
useful. You can print something like "I assume that by 'Madonna' you 
meant the American singer, songwriter, and actress (Q1744)" and let the 
user change this. As it is now, the husbands of Madonna give a rather 
surprising mix of results for different entities. This might still be 
entertaining for Madonna, but it is a real problem for questions like 
"Where is Paris?", which do not produce a meaningful result (you get a 
list of places and coordinates for different things, and you cannot find 
out which coordinate belongs to which place).
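To make the two points above concrete, here is a minimal sketch in Python of resolving a label before generating SPARQL. It assumes the Wikidata API module wbsearchentities and the wd:/wdt: prefixes of the public query service; the function names (search_url, describe_choice, build_query), the hand-written candidate record, and the choice of P26 (spouse) are illustrative assumptions, not part of the tool discussed here. No network request is made.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata API endpoint.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def search_url(label, language="en", kind="item"):
    """Build a wbsearchentities request URL for one label (nothing is fetched)."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": kind,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def describe_choice(label, candidate):
    """Tell the user which entity was picked, so they can change it."""
    return (
        f"I assume that by '{label}' you meant "
        f"{candidate['description']} ({candidate['id']})"
    )


def build_query(item_id, property_id):
    """Build a SPARQL query over one resolved item: no label matching, no UNIONs."""
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        f"SELECT ?value WHERE {{ wd:{item_id} wdt:{property_id} ?value }}"
    )


# Hypothetical candidate, written by hand in the shape of one search hit
# (an "id" and a "description"); a real tool would take it from the API reply.
candidate = {
    "id": "Q1744",
    "description": "the American singer, songwriter, and actress",
}

print(search_url("Madonna"))
print(describe_choice("Madonna", candidate))
print(build_query(candidate["id"], "P26"))  # P26: spouse
```

The label is looked up once while building the query, and the query itself then mentions a single item, so its size no longer grows with the number of candidate labels.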
A label-based query paradigm can also work, but then it should produce 
queries that return the entities that were found for each label, so the 
user can at least see from the result which "Madonna" each husband 
belongs to.
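A rough sketch of what such a label-based query could look like, again generated from Python. The helper name build_label_query is hypothetical, the prefixes are those of the public query service, and only main labels (rdfs:label) are matched here, not alternative labels. The point is just that the matched entity is part of the SELECT clause.

```python
def build_label_query(label, property_id, language="en"):
    """Build a label-based query that also returns the entity matched by the label."""
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?entity ?value WHERE {\n"
        f'  ?entity rdfs:label "{escaped}"@{language} .\n'
        f"  ?entity wdt:{property_id} ?value .\n"
        "}"
    )


# P26 (spouse) is used purely as an illustration.
print(build_label_query("Madonna", "P26"))
```

Each result row then pairs a value with the ?entity it came from, so the user can tell which "Madonna" a given husband belongs to.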
Cheers,
Markus
