On 13.02.2016 23:50, Kingsley Idehen wrote: ...
Markus and others interested in this matter,
What about using OFFSET and LIMIT to address this problem? That's what we advise users of the DBpedia endpoint (and others we publish) to do.
We have to educate people about query implications and options. Even after that, you have the issue of timeouts (which aren't part of the SPARQL spec) that can be used to produce partial results (notified via HTTP headers), but that's something that comes after the basic scrolling functionality of OFFSET and LIMIT is understood.
I think this does not help here. If I only ask for part of the data (see my previous email), I can get all 300K results in 9.3 sec, so the size of the result does not seem to be the issue. If I add further joins to the query, the time needed seems to go above the 10 sec timeout even with a LIMIT. Note that you need to order the results to use LIMIT reliably, since the data changes by the minute and the "natural" order of results would change with it. I guess that with a blocking operator like ORDER BY in the equation, LIMIT does not really save much time (other than for final result serialisation and transfer, which seem pretty quick).
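To make the paging scheme concrete, here is a minimal sketch of what OFFSET/LIMIT scrolling with a fixed order would look like on the client side. The helper name, the triple pattern, and the page size are hypothetical and only for illustration; nothing here is tied to a particular endpoint, and the code only builds the query strings without sending them.

```python
from typing import Iterator


def paged_queries(pattern: str, order_var: str, page_size: int, total: int) -> Iterator[str]:
    """Yield one SPARQL query string per page, using ORDER BY + LIMIT + OFFSET.

    Hypothetical helper: `pattern` is the graph pattern to match, `order_var`
    the variable (without '?') that fixes the result order between requests.
    """
    for offset in range(0, total, page_size):
        yield (
            f"SELECT * WHERE {{ {pattern} }} "
            f"ORDER BY ?{order_var} "
            f"LIMIT {page_size} OFFSET {offset}"
        )


# Assumed numbers for illustration: 300K results fetched in pages of 100K.
queries = list(paged_queries("?item ?p ?o", "item", 100_000, 300_000))
for q in queries:
    print(q)
```

Each of the three generated queries contains the same ORDER BY, so the endpoint presumably has to compute and sort the full result for every page before it can apply LIMIT and OFFSET. Ordering by a single variable is also only stable if that variable is unique per result row.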
Markus
[1] http://stackoverflow.com/questions/20937556/how-to-get-all-companies-from-db...
[2] https://sourceforge.net/p/dbpedia/mailman/message/29172307/
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata