Re: [Wikidata] SPARQL/BlazeGraph: label service performance

6 Mar 2016


Hi Stas,
Thanks for the really quick reply. I agree with your analysis: it seems 
the service is implemented as a blocking operator, whereas it could 
really be streaming for local services (and maybe even for remote ones).
My version with the subquery seems really fast now, but I did not do any 
profiling. I would have thought that the service is generally faster 
than the OPTIONAL-FILTER-LANG combination. It would be interesting to know
which one is better (I will use it in /many/ queries).
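The subquery version itself is not quoted in this message. As a rough sketch only, assuming the idea is to move the LIMIT into an inner SELECT so that the label service receives just 11 bindings instead of the full result set, it might look like this:

```sparql
# Sketch, not the exact query from the thread: the inner SELECT is
# evaluated first and limited, so the label service only has to
# resolve labels for at most 11 items.
SELECT $p $pLabel
WHERE {
    {
        SELECT $p
        WHERE {
            $p wdt:P31 _:bnode .
        } LIMIT 11
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

This relies on the prefixes (wdt:, wikibase:, bd:) being predefined, as they are on the Wikidata query service.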
Regards,
Markus
On 06.03.2016 22:46, Stas Malyshev wrote:
...
Hi!
...
There is a performance issue with the labelling service. Using labels
makes even simple queries time out. For example this one:
SELECT $p $pLabel
WHERE {
    $p wdt:P31 _:bnode .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
} LIMIT 11
I suspect the issue here is that it tries to calculate the full set
of values before applying the service. That may make sense if the service
is external, but if it is internal and the result set is huge, it obviously
does not work.
Another alternative, since you are just looking for English labels,
is to query the labels directly:
SELECT $p $pLabel
WHERE {
    $p wdt:P31 _:bnode .
    OPTIONAL {
        $p rdfs:label $pLabel .
        FILTER(lang($pLabel) = "en")
    }
} LIMIT 11
This seems to work just fine. You lose a bit of the service's added
value (nicer fallback labels for items without a label), but you gain a
lot of speed.
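If the fallback matters, it can be approximated in the direct query. The following is a sketch under the assumption that showing the item ID is an acceptable substitute when no English label exists; the variable $label is introduced here only for illustration:

```sparql
# Sketch: use the English label if present, otherwise fall back to the
# item ID taken from the entity URI (e.g. "Q42").
SELECT $p $pLabel
WHERE {
    $p wdt:P31 _:bnode .
    OPTIONAL {
        $p rdfs:label $label .
        FILTER(lang($label) = "en")
    }
    BIND(COALESCE($label, STRAFTER(STR($p), "http://www.wikidata.org/entity/")) AS $pLabel)
} LIMIT 11
```

COALESCE returns its first argument that evaluates without error, so an unbound $label falls through to the ID string.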
In any case, I'll raise this issue with Blazegraph, and it may also be
worth submitting a Phabricator issue about it.
-- 
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/
