On 25/10/2015 09:31, Markus Krötzsch wrote:
On 25.10.2015 02:18, Kingsley Idehen wrote:
On 10/24/15 10:51 AM, Markus Krötzsch wrote:
We were talking about *cyclic data* not cyclic queries (which you can also create easily using BGPs, but that's unrelated here). Apparently, BlazeGraph has performance issues when computing a path expression over a cyclic graph.
Markus
Markus,
Out of curiosity, can you share a SPARQL query example (text or query results url) that demonstrates your point?
You mean a query with BlazeGraph having performance issues? That problem was reported by Stas. He should have examples. In any case, it is always a combination of query and data.
Hi Kingsley,
I had a problem with Blazegraph queries that had path requirements containing a compound path predicate and ending in a variable, e.g.
wd:Q289 wdt:P31/wdt:P279* ?o.
However, this particular example now appears to work. (With the recent upgrade of the SPARQL endpoint to the latest Blazegraph production release?)
On the other hand, it appears that path queries can still fail if they involve a variable intended to be a fixed constant set by a BIND statement (evaluating which is usually the first thing a query engine will do).
So, for example, a query to count incidences of instances of subclasses of painting, where the key requirement statement is
?a wdt:P31/wdt:P279* wd:Q3305213
runs in about 0.4 seconds. However, a very similar query where the identity of that target superclass is set using a BIND statement,
BIND (wd:Q3305213 AS ?class) . ?a wdt:P31/wdt:P279* ?class .
times out. Or rather, it ought to report that it has timed out, and it used to; now, instead of throwing a "Query Timed Out" error, it returns an (incorrect) count of zero after 120 seconds. (An additional, new bug.)
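For anyone who wants to try the two forms side by side, here is a rough sketch in Python that just builds the two query strings. The SELECT/COUNT wrapper is my assumption about what a minimal count query would look like, and the names PREFIXES, count_query_constant and count_query_bind are made up for illustration; the only parts taken from the queries above are the two path patterns.

```python
# Sketch only: the SELECT/COUNT wrapper is an assumption, and the helper
# names below are hypothetical. Only the path patterns come from the thread.
PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
)


def count_query_constant(class_id: str) -> str:
    """Count query with the target superclass written inline as a constant."""
    return (
        PREFIXES
        + "SELECT (COUNT(?a) AS ?count) WHERE {\n"
        + f"  ?a wdt:P31/wdt:P279* wd:{class_id} .\n"
        + "}"
    )


def count_query_bind(class_id: str) -> str:
    """Same count, but the target superclass is set through BIND."""
    return (
        PREFIXES
        + "SELECT (COUNT(?a) AS ?count) WHERE {\n"
        + f"  BIND (wd:{class_id} AS ?class) .\n"
        + "  ?a wdt:P31/wdt:P279* ?class .\n"
        + "}"
    )


if __name__ == "__main__":
    # Q3305213 is the "painting" class used in the example above.
    print(count_query_constant("Q3305213"))
    print()
    print(count_query_bind("Q3305213"))
```

Pasting each printed query into the query service should show whether the two forms still behave differently.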
Complete versions of these queries can be found at https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/suggestions#Path...
and as a Blazegraph bug at
https://jira.blazegraph.com/browse/BLZG-1543
(although, as with a couple of other issues described on the wiki page linked above for which I've filed Blazegraph bugs, there doesn't seem to be any indication that anybody has actually read the bug...)
I'm not sure if Stas knows of other current issues with path queries.
I did post a complaint to this list, just after the query service was publicly announced, that path queries seemed very slow. They *are* still slower than the equivalent search on WDQ. But I think it was this issue with binding variables that was underlying the worst of what I was seeing.
As for cyclical paths: as I posted a couple of days ago, the queries at https://www.wikidata.org/wiki/Wikidata:WikiProject_Names/given-name_variants for counting up incidences of given-name variants involve graphs that are anything but directed (they are based on the P460 "said to be the same as" property), and Blazegraph seems to handle them without any particular difficulty. It's possible, though, that there may have been earlier problems when the service was still at an alpha stage.
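To illustrate why cycles in the data need not be a problem in principle, here is a toy sketch in Python of evaluating a "zero or more steps" path (like p*) over symmetric links. It is not how Blazegraph does it, as far as I know; it is just a breadth-first walk that remembers which nodes it has visited. The function name reachable and the sample names are invented for the example.

```python
from collections import deque


def reachable(edges, start):
    """Return every node reachable from start in zero or more steps.

    The `seen` set is what keeps cycles harmless: a node that has already
    been visited is never queued a second time, so the walk terminates.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Invented toy data: symmetric "said to be the same as" links, so every
# link appears in both directions and the graph is full of cycles.
same_as = {
    "John": ["Jon", "Johann"],
    "Jon": ["John"],
    "Johann": ["John", "Jan"],
    "Jan": ["Johann"],
    "Mary": [],
}

variants = reachable(same_as, "Jon")
print(sorted(variants))
```

Each node is expanded at most once, so the cost grows with the size of the connected cluster rather than with the number of cycles in it.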
-- James.