On 25/10/2015 09:31, Markus Krötzsch wrote:
> On 25.10.2015 02:18, Kingsley Idehen wrote:
>> On 10/24/15 10:51 AM, Markus Krötzsch wrote:
>>> We were talking about *cyclic data* not cyclic queries (which you can
>>> also create easily using BGPs, but that's unrelated here). Apparently,
>>> BlazeGraph has performance issues when computing a path expression
>>> over a cyclic graph.
>>>
>>> Markus
>>
>> Markus,
>>
>> Out of curiosity, can you share a SPARQL query example (text or query
>> results url) that demonstrates your point?
>
> You mean a query with BlazeGraph having performance issues? That problem
> was reported by Stas. He should have examples. In any case, it is always
> a combination of query and data.

Hi Kingsley,
I had a problem with Blazegraph queries whose path patterns used a
compound path predicate and ended in a variable, e.g.
    wd:Q289 wdt:P31/wdt:P279* ?o .
However, this particular example now appears to work. (Perhaps because
of the recent upgrade of the SPARQL endpoint to the latest Blazegraph
production release?)
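
For anyone wanting to try it, a self-contained form of that pattern might
look like the sketch below. The SELECT DISTINCT wrapper is an assumption
added for illustration, not part of the original report; the prefixes are
the standard Wikidata ones.

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Everything reachable from Q289 by one "instance of" (P31) step
# followed by zero or more "subclass of" (P279) steps.
SELECT DISTINCT ?o WHERE {
  wd:Q289 wdt:P31/wdt:P279* ?o .
}
```
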
On the other hand, it appears that path queries can still fail if they
involve a variable that is meant to act as a fixed constant and is set
by a BIND statement (evaluating the BIND is usually the first thing a
query engine will do).
So, for example, a query to count incidences of instances of subclasses
of painting, where the key requirement statement is
?a wdt:P31/wdt:P279* wd:Q3305213
runs in about 0.4 seconds. However, a very similar query where the
identity of that target superclass is set using a BIND statement,
BIND (wd:Q3305213 AS ?class) .
?a wdt:P31/wdt:P279* ?class .
times out. Or rather, it ought to report that it has timed out, and it
used to; now, instead of throwing a "Query Timed Out" error, it returns
an (incorrect) count of zero after 120 seconds, which is an additional,
new bug.
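
To make the comparison concrete, the two variants could be written out
roughly as below. This is only a sketch: the COUNT wrapper and the ?count
variable name are assumptions for illustration, and the complete queries
actually used are the ones on the wiki page.

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Variant 1: the target superclass is written directly as a constant.
SELECT (COUNT(?a) AS ?count) WHERE {
  ?a wdt:P31/wdt:P279* wd:Q3305213 .
}
```

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Variant 2: the same superclass, but supplied through BIND.
# Logically equivalent to variant 1; only the way the end of the
# path is given differs.
SELECT (COUNT(?a) AS ?count) WHERE {
  BIND (wd:Q3305213 AS ?class) .
  ?a wdt:P31/wdt:P279* ?class .
}
```

Both should return the same count, since the BIND merely fixes ?class to
the constant used in the first variant.
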
Complete versions of these queries can be found at
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/suggestions#Pat…
and as a Blazegraph bug at
https://jira.blazegraph.com/browse/BLZG-1543
(although, as with a couple of other issues described on the wiki page
linked above for which I've also filed Blazegraph bugs, there doesn't
seem to be any indication that anybody has actually read the bug...)
I'm not sure if Stas knows of other current issues with path queries.
I did post a complaint to this list, just after the query service was
publicly announced, that path queries seemed very slow. They *are*
still slower than the equivalent search on WDQ. But I think it was this
issue with binding variables that was underlying the worst of what I was
seeing.
As for cyclical paths, as I posted a couple of days ago, the queries at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Names/given-name_variants
for counting up incidences of given-name variants involve graphs that
are anything but directed (based on the P460 "said to be the same as"
property), and Blazegraph seems to handle them without any particular
difficulty; though it's possible that there may have been earlier
problems when the service was still at an alpha stage.
-- James.