Hi!
For me, it’s perfectly ok when a query runs for 20 minutes, if it spares me some hours of setting up a specific environment for one specific dataset (and doing it again when I need current data two months later). And it would be no issue if the query ran much longer in situations where it competes with several others. But of course, that’s not what I want to experience when I use a Wikidata service to drive, e.g., an autosuggest function for selecting entities.
I understand that, but this is a shared server which is supposed to serve many users, and if we allowed 20-minute queries on this service, it would soon become unusable. This is why we have a 30-second limit on the server.
Now, we have considered having a separate server or setup that allows running longer queries, but currently we don't have one. It would require some budget allocation and work to build, so it's not something we can have right now. There are use cases for very long queries and very large results; the current public service endpoint is just not good at serving them, because that's not what it was meant for.
And do you think the policies and limitations of different access strategies could be documented? These could include a high-reliability
I agree that the limitations should be documented; the problem is that we don't know everything we may need to document, such as "which queries may be bad". When I see something like "I want to download a million-row dataset", I know it's probably a bit too much. But I can't have a hard rule that says 1M-1 is ok, but 1M is too much.
preferred option). And on the other end of the spectrum something that allows people to experiment freely. Finally, the latter kind of
I'm not sure how I could maintain an endpoint that would allow people to do anything they want and still provide an adequate experience for everybody. Maybe if we had infinite hardware resources... but we do not.
Otherwise, it is possible - and should not be extremely hard - to set up one's own instance of the Query Service and use it for experiments that need heavy lifting. Of course, that would require resources - but there's no magic here, it would require resources from us too, both in terms of hardware and of people to maintain it. So there are some things we can do now, some things we would be able to do later, and some things we probably would not be able to offer with any adequate quality.
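For what it's worth, here is a minimal sketch of how a client script could be written so that the same code runs against either the public endpoint or one's own instance. The helper names (`build_request`, `run_query`), the User-Agent string, and the local URL in the usage note are illustrative assumptions, not part of any official client; only the public endpoint URL is the real one.

```python
import json
import urllib.parse
import urllib.request

# Public endpoint; a self-hosted instance would have its own URL.
PUBLIC_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query, endpoint=PUBLIC_ENDPOINT):
    """Build a GET request that asks the endpoint for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        # Hypothetical identification string; replace with your own.
        "User-Agent": "example-script/0.1 (hypothetical contact)",
    }
    return urllib.request.Request(url, headers=headers)


def run_query(query, endpoint=PUBLIC_ENDPOINT, timeout=35):
    """Send the query and return the parsed JSON response.

    `timeout` is only the client-side socket timeout in seconds; it does
    not change whatever limit the server itself enforces.
    """
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A heavy experimental query could then be sent with something like `run_query(q, endpoint="http://localhost:9999/sparql", timeout=3600)`, where the local URL is a placeholder for wherever one's own instance listens, while short interactive queries keep using the default public endpoint.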