Hi Joachim,
Stas would be the right person to discuss service parameters and the possible setup of more servers with other parameters. He is part of the team at the WMF that is in charge of SPARQL operations.
You note that "it isn’t always obvious what is right and what the limitations of a tool are". I think this is the key point here. There is not enough experience with the SPARQL service yet to define clear guidelines on what works and what doesn't. On this mailing list, we have frequently been reminded to use LIMIT in queries to make sure they terminate and don't overstress the server, but I guess this is not part of the official documentation you refer to.

There was no decision against supporting bigger queries either -- it just has not come up as a major demand yet, since the typical applications that use SPARQL so far need tens to thousands of results, not hundreds of thousands to millions. To be honest, I would not have expected bulk retrieval to work well enough in practice for it to even be an option here.

It is interesting to learn that you are already using SPARQL for generating custom data exports. It is probably not the most typical use of a query service, but the query language could at least support this usage in principle.
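As a minimal illustration (my own example, not taken from any official documentation): on the Wikidata endpoint, where the wd:/wdt: prefixes are predefined, a query like the following stays well-behaved, because LIMIT caps the result size even though the pattern matches millions of items:

  # Illustrative only: the first 100 humans (Q5) with their English labels.
  SELECT ?item ?itemLabel WHERE {
    ?item wdt:P31 wd:Q5 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  }
  LIMIT 100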
Cheers,
Markus
On 11.02.2016 19:32, Neubert, Joachim wrote:
Hi Lydia,
I agree on using the right tool for the job. Yet, it isn’t always obvious what is right and what the limitations of a tool are.
For me, it’s perfectly ok when a query runs for 20 minutes, when it spares me some hours of setting up a specific environment for one specific dataset (and doing it again when I need current data two months later). And it would be no issue if the query ran much longer, in situations where it competes with several others. But of course, that’s not what I want to experience when I use a Wikidata service to drive, e.g., an autosuggest function for selecting entities.
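To make this concrete, here is a sketch of the kind of long-running export I mean (the property and page size are arbitrary examples): a CONSTRUCT query paged with ORDER BY, LIMIT and OFFSET, so that no single response has to carry the complete result set:

  # Hypothetical example: export all GND ID (P227) statements, page by page.
  CONSTRUCT { ?item wdt:P227 ?gnd . }
  WHERE { ?item wdt:P227 ?gnd . }
  ORDER BY ?item        # stable ordering, so pages do not overlap
  LIMIT 50000 OFFSET 0  # increase OFFSET by 50000 for each further page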
So, can you agree to Markus' suggestion that an experimental “unstable” endpoint could address these different use cases and expectations?
And do you think the policies and limitations of the different access strategies could be documented? These could include a high-reliability interface for a narrow range of queries (as Daniel suggests as his preferred option), and, on the other end of the spectrum, something that allows people to experiment freely. The latter kind of interface could let new patterns of usage evolve, with perhaps a few of them proving worthwhile enough to become part of an optimized, highly reliable query set.
I could imagine that such a documentation of (and perhaps discussion on) the different options and access strategies, their limitations and tradeoffs, could address Gerard's demand to give people what they need, or at least let them make informed choices when restrictions are unavoidable.
Cheers, Joachim
*From:* Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] *On behalf of* Lydia Pintscher
*Sent:* Thursday, 11 February 2016 17:55
*To:* Discussion list for the Wikidata project.
*Subject:* Re: [Wikidata] SPARQL CONSTRUCT results truncated
On Thu, Feb 11, 2016 at 5:53 PM Gerard Meijssen <gerard.meijssen@gmail.com> wrote:
Hoi, Markus, when you read my reply on the original question you will see that my approach is different. The first thing that I pointed out was that a technical assumption has little to do with what people need. I indicated that when this is the approach, the answer is: fix it. The notion that a large number of results is outrageous is not of this time. My approach was one where I even offered a possible solution, a crutch. The approach Daniel took was to make me look ridiculous. His choice, not mine. I stayed polite and told him that his answers are not my answers and why. The point that I make is that Wikidata is a service. It will increasingly be used for the most outrageous queries, and people will expect it to work, because why else do we put all this data in there? Why else is this the data hub for Wikipedia? Why else do we appreciate that the aim of the WMF is to share in the sum of all available knowledge? When the current technology is what we have to make do with, fine for now. Say so, but do not ridicule me for saying that it is not good enough; it is not now, and it will certainly not be in the future... Thanks, GerardM
Gerard, it all boils down to using the right tool for the job. Nothing more - nothing less. Let's get back to making Wikidata rock.
Cheers Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata