On Wed, May 9, 2018 at 1:07 PM, Jaime Crespo <jcrespo@wikimedia.org> wrote:
> https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&am...
> What to do?
We don't have more hardware at this point, so I guess the two things that come to mind are:
* implement more query duration throttling
* find users making poorly written queries and try to help them improve
Throttling is not something that we love doing, but it is easy to scale on the implementation side. Helping people fix bad queries is more fun because it makes things nicer for everyone, but it takes a lot more time, and that time can be difficult to find.
Jaime, do you have any ideas that are better or a preference on which we try first?
Bryan