Hi,
These emails are causing alert fatigue.
We've tweaked the thresholds high enough to make them rare, but they still occur and we never take any action (in part because there's nothing feasible to be done until we change our storage situation and/or most workloads are migrated to Kubernetes, where we could implement better controls).
I'd like to propose we disable these alerts for the time being and re-evaluate our service level indicators when appropriate.
Giovanni Tirloni
Operations Engineer
Wikimedia Cloud Services
On Mon, Feb 4, 2019, 01:47 shinken <shinken@shinken-02.shinken.eqiad.wmflabs> wrote:
Notification Type: RECOVERY
Service: High iowait
Host: tools-exec-1419
Address: 10.68.23.223
State: OK
Date/Time: Mon 04 Feb 03:46:59 UTC 2019
Notes URLs:
Additional Info:
OK: All targets OK