Today we deployed an updated version of the webservicemonitor service, which we use to help ensure that `webservice --backend=gridengine ...` processes are actively running on the job grid. The main change in this version is that it now tracks the timestamps of past restart attempts for each tool and enforces a restart rate limit. The initial limit is 3 restarts per 60-minute sliding window.
This change will not stop a tool maintainer from running `webservice restart` manually. You can read more of the reasoning behind the change at https://phabricator.wikimedia.org/T107878.
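
For anyone curious how a limit like this behaves, here is a minimal sketch of a per-tool sliding-window check. It is not the actual webservicemonitor code: the `RestartLimiter` class, its `may_restart` method, and the choice to count only restarts that were allowed are assumptions made for illustration. Only the numbers (3 restarts, 60 minutes) come from the announcement above.

```python
import time
from collections import defaultdict, deque

MAX_RESTARTS = 3          # initial limit from the announcement
WINDOW_SECONDS = 60 * 60  # 60-minute sliding window


class RestartLimiter:
    """Hypothetical helper; not the real webservicemonitor implementation."""

    def __init__(self, max_restarts=MAX_RESTARTS, window=WINDOW_SECONDS):
        self.max_restarts = max_restarts
        self.window = window
        # Per-tool timestamps of restarts that were allowed, oldest first.
        self.history = defaultdict(deque)

    def may_restart(self, tool, now=None):
        """Return True and record the attempt if `tool` is under the limit."""
        now = time.time() if now is None else now
        attempts = self.history[tool]
        # Forget attempts that have slid out of the window.
        while attempts and now - attempts[0] >= self.window:
            attempts.popleft()
        if len(attempts) >= self.max_restarts:
            return False
        attempts.append(now)
        return True
```

With this sketch, a tool that is restarted at minutes 0, 10, and 20 would be refused at minute 30 and allowed again at minute 60, once the first attempt has aged out of the window. Each tool has its own history, so one tool hitting the limit does not affect any other.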
Bryan
cloud-announce@lists.wikimedia.org