Hi!
> You ask for understanding but there is so little to go on. When you say
> overuse, what does that mean? Obviously it was to be expected that there
That means people (not frequently, but occasionally) were running tons
of heavy queries that left the service clogged and unusable for others. We
want to ensure maximum availability for everyone, but the resources are
finite. So, somebody who runs heavy queries will have to pace them so
that others can use the service too. Throttling doesn't block heavy
queries outright, but it doesn't allow a client to produce so many of
them that it affects service availability.
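
As a rough illustration of this kind of throttling, here is a minimal
sketch of a per-client time budget. It is not the actual service code:
the class name, the method names and all the numbers are hypothetical
and chosen only for the example. A client may start a query as long as
its budget is not negative, so a single heavy query is never refused up
front, but the time each query takes is charged afterwards and a client
that overspends has to wait for the budget to refill.

```python
class TimeBudget:
    """Hypothetical per-client budget of query-processing seconds."""

    def __init__(self, capacity=60.0, refill_rate=1.0):
        self.capacity = float(capacity)        # max seconds a client can bank
        self.refill_rate = float(refill_rate)  # budget regained per wall-clock second
        self.tokens = float(capacity)          # current budget, may go negative
        self.last_seen = 0.0                   # time of the last update

    def _refill(self, now):
        # Credit the time that passed since the last update, up to capacity.
        elapsed = max(0.0, now - self.last_seen)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_seen = now

    def allow(self, now):
        """A query may start while the budget is not negative."""
        self._refill(now)
        return self.tokens >= 0

    def charge(self, now, query_seconds):
        """After a query finishes, deduct the time it actually took."""
        self._refill(now)
        self.tokens -= query_seconds

    def retry_after(self):
        """Seconds to wait until the budget is no longer negative."""
        return max(0.0, -self.tokens / self.refill_rate)
```

With these made-up defaults, a client that runs two 50-second queries
back to back ends up 40 seconds in debt and is asked to wait 40 seconds,
while clients that stay within their budget never notice the limit.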
> What is the expectation of the growth of the Wikidata Query Service,
> and the SPARQL endpoint? What preparations have been made to
> accommodate this? What will happen when this growth is 1000 times bigger
> than what you expect?
Frankly, as of now, I do not know the full answer to that. But you are
right, it is something that needs to be thought about, and we are
thinking about it. However, this particular issue is not about that -
this one is about ensuring that existing resources are fairly shared
between the users. I don't know of any service in existence that has
infinite resources, and they all have usage limits, explicit or
implicit. We are no exception.
If your use case requires more resources than the default allocation
allows, please talk to us and we can tune it or seek some alternative
solution.
> As a statement of fact with a validity of say a week your message is
> fine. After this week it is really important to know and experience what
> you deliver to make the existing growth happen without resorting to the
> throttling the current use of our services.
As I said, I don't think it is possible to run a public service without at
least *some* kind of limits. We are just making those limits explicit
and allocated in a way that allows fair sharing of existing resources.
If current limits look too strict, they are configurable and we can
learn from experience and re-allocate them.
If it turns out that current resources are not enough, we'll request more,
but that is not the case now - we are completely OK with regular
workloads. Unfortunately - most probably due to mistakes in coding -
occasionally we get clients that produce exceptionally heavy workloads
that block the service for other users. Throttling is meant to reduce
the effect of such workloads and ensure availability for all clients. It
is not our purpose to block any legit workload, and if that happens,
please tell us and we'll look into adjusting the limits.
Thanks,
--
Stas Malyshev
smalyshev(a)wikimedia.org