Hi Markus,
(I did read your paper ;-)
Awesome :-)
Of course, all the issues I observe might be due to the implementation.
Or due to the kind of queries. Some queries will always be hard; with SPARQL endpoints, you pay the price in server cost; with TPF, you pay the price in query time and bandwidth. No silver bullet.
Maybe the general idea could still be made to work somehow.
Certainly, especially if "general idea" also includes looking for other alternatives beyond TPF, as recent research has done.
TPF is by no means a final solution. It's the start of a dialogue: look what happens with scalability if we make the server more lightweight.
- "Brad Pitt" from the first page: 55 seconds in Firefox, but only 6 seconds in Chrome (!).
That's due to caching; clearing the cache should yield more comparable timings across browsers.
Results are not really correct, but as you said, this should be fixed soon.
I've published v2.0.4 of the client, which should fix things once deployed.
- "Rivers in Antarctica" (with all label-fetching triples removed) This is really a rather direct query, with merely one OPTIONAL and three triple patterns in total. It ran for about 5 minutes before stopping and clearing the timer (no results shown).
Would need to look into that; unfortunately, my job mostly allows me to create prototypes. We could really use devs (and a budget) to develop further :-)
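For reference, a Wikidata query of the shape you describe (three triple patterns plus one OPTIONAL) might look roughly like this; the specific properties and items here are my guesses for illustration, not necessarily the ones in your actual query:

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?river ?coord ?mouth WHERE {
  ?river wdt:P31 wd:Q4022 .              # instance of: river
  ?river wdt:P30 wd:Q51 .                # continent: Antarctica
  ?river wdt:P625 ?coord .               # coordinate location
  OPTIONAL { ?river wdt:P403 ?mouth . }  # mouth of the watercourse, if known
}
```

A TPF client evaluates this by requesting each triple pattern as a separate paged fragment and joining the results client-side, so even a small query like this can trigger many HTTP requests.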
- "Largest cities of the world" (with labels fetched by the plain SPARQL method, as in the example query) This one got me a *server-side* timeout message after 5 minutes of waiting; I would paste it here, but the UI reset itself before I could copy it.
This query contains a blocking operator; those are indeed hard with TPF and always will be. To make such queries run fast, more powerful interfaces are useful (but they of course come at a cost).
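To illustrate what "blocking" means here: an operator like ORDER BY (or aggregation) cannot emit its first result until all solutions have been gathered. A minimal sketch of such a query, with illustrative Wikidata IDs of my choosing:

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?city ?population WHERE {
  ?city wdt:P31 wd:Q515 .          # instance of: city
  ?city wdt:P1082 ?population .    # population
}
ORDER BY DESC(?population)         # blocking: all bindings needed before sorting
LIMIT 10
```

Over TPF, the client must first stream in every matching city before the ORDER BY can produce its first row, and that is where the time goes.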
Maybe I am just trying the wrong queries?
I think so, yes.
This also shows our departure from typical SemWeb ideas: with TPF, we accept that some things are slow. Any query is possible (given a completely implemented engine), but some of them just take a lot of time. It has never been my goal to be able to evaluate any query fast; I'm particularly interested in making those queries fast that humans could also easily execute over Wikipedia/Wikidata. I.e., I see the query engine mostly as an improvement over manual Web searches, not as a trimmed-down version of a centralized SPARQL engine. We just use the same language. I aim to improve the Web, not SPARQL ;-)
Do you have a sample query from the Wikidata query samples that you would recommend as a good showcase?
I don't have too much experience with Wikidata, unfortunately, but the DBpedia examples at http://client.linkeddatafragments.org/ can give you some inspiration. It would be a good idea to port those queries.
Best,
Ruben