Ah, thanks Jerven. How do you deal with http-layer query timeouts? Are you able to predict them for certain common queries rather than waiting for the timeout to hit? S
On Fri, Jan 13, 2023 at 4:02 AM Jerven Tjalling Bolleman <firstname.lastname@example.org> wrote:
Regarding these "fair use" settings: they are tuneable and may be
turned off, so the specific
values that OpenLink uses may or may not apply if Wikidata were to
host a Virtuoso instance itself.
For example, on sparql.uniprot.org you are unlikely to run into these
limits (as the values are set very high indeed)
and are more likely to suffer from settings around the HTTP layer
that limit query run time due to connection issues.
On 1/12/23 11:45 PM, Kingsley Idehen via Wikidata wrote:
On 1/12/23 3:39 AM, Larry Gonzalez wrote:
Let me start by saying that I appreciate the effort of
loading the complete Wikidata into a graph database and making a
SPARQL endpoint available. I know it is not an easy task.
I just tried out the new Virtuoso-hosted SPARQL endpoint with
some queries. My experiments are not exhaustive at all, but I
just wanted to raise two concerns that I noticed.
Consider a (very simple) query that counts all humans:
SELECT (count(?human) as ?c)
WHERE {
  ?human wdt:P31 wd:Q5 .
}
I get a result of 10396057, which is OK considering the dataset
that you are using.
But if we try to export all instances of human (to a TSV file)
with the following query:
Then I only get 100000 results. Is there a limit on the number
of results that a query can return?
Yes, because these services are primarily for ad-hoc querying
rather than wholesale data exports. If you want to export massive
amounts of data then you can do so using OFFSET and LIMIT.
Alternatively, you can instantiate your own instance in the Azure
or AWS cloud and use it as you see fit.
As with what we provide for DBpedia, there's a server-side
configuration in place for enforcing a "fair use" policy :)
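The OFFSET and LIMIT approach suggested above could be sketched as follows. This is only an illustration, not the endpoint operators' recommended client: the helper name `paged_queries` is hypothetical, the SELECT pattern is an assumed form of the export query, and the page size simply reuses the 100000-row cap and the 10396057 count reported in this thread. Note that without ORDER BY, SPARQL does not guarantee a stable row order across pages, so pages may overlap or skip rows if the data changes in between.

```python
from math import ceil
from typing import Iterator

# Standard Wikidata namespaces for the wd: and wdt: prefixes.
PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
)


def paged_queries(total: int, page_size: int) -> Iterator[str]:
    """Yield one SELECT query per page, covering `total` rows in all."""
    for page in range(ceil(total / page_size)):
        yield (
            f"{PREFIXES}"
            # Assumed export query: every instance of human (Q5).
            "SELECT ?human WHERE { ?human wdt:P31 wd:Q5 . }\n"
            f"LIMIT {page_size} OFFSET {page * page_size}"
        )


# Figures taken from the thread: total count and observed result cap.
queries = list(paged_queries(total=10396057, page_size=100000))
print(len(queries))   # number of requests needed
print(queries[-1])    # query for the final, partial page
```

Each generated string would then be sent to the endpoint as a separate request and the TSV pages concatenated on the client side.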
Furthermore, if we want to get all humans ordered by id, then
the endpoint times out. The following is the query:
?human wdt:P31 wd:Q5 .
ORDER BY DESC(?human)
If you set the query timeout to a value over 1000 msecs, the
Virtuoso Anytime Query feature will provide you with a partial
solution, which you can use in conjunction with OFFSET and LIMIT to
create an interactive cursor (or scrollable cursor). Beyond
that, it's back to the "fair use" policy and the option to instantiate
your own service-specific instance using our cloud offerings.
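A rough sketch of how a client might pass such a timeout along with a paged query is shown below. It rests on assumptions that should be checked against the endpoint's own documentation: that the timeout is accepted as a `timeout` URL parameter in milliseconds, and that `https://example.org/sparql` stands in for the real endpoint address. The helper `anytime_url` is hypothetical and only builds the request URL; it does not contact any server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def anytime_url(endpoint: str, query: str, timeout_ms: int) -> str:
    """Build a GET URL carrying the query and a timeout in milliseconds."""
    # The thread states that partial results need a timeout above 1000 msecs.
    if timeout_ms <= 1000:
        raise ValueError("timeout_ms must be greater than 1000")
    # Assumption: the endpoint reads 'query' and 'timeout' URL parameters.
    params = {"query": query, "timeout": str(timeout_ms)}
    return f"{endpoint}?{urlencode(params)}"


# One page of a cursor-style scan; OFFSET would advance on each request.
page_query = (
    "SELECT ?human WHERE { ?human wdt:P31 wd:Q5 . } "
    "ORDER BY DESC(?human) LIMIT 1000 OFFSET 0"
)
url = anytime_url("https://example.org/sparql", page_query, timeout_ms=5000)
print(url)

# Round-trip check that the parameters survive URL encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["timeout"], decoded["query"][0] == page_query)
```

Because a partial solution may be incomplete, a client built this way should not assume that a short page means the end of the data.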
Thank you again for all your efforts. I am looking forward to
seeing how this new endpoint works :)