[Labs-l] Getting SIGTERM on grid engine jobs

Tim Landscheidt tim at tim-landscheidt.de
Tue Feb 17 23:25:25 UTC 2015


"Marc A. Pelletier" <marc at uberbox.org> wrote:

>> Unfortunately, now that I test it, it doesn't seem to work even with -m
>> a. This might be because the hosts actually are not allowed to /send/
>> e-mail, but I'm not completely sure.

> They do; it's just that the jsub frontend doesn't speak -m -
> you can use qsub directly to use that functionality,
> however.

jsub does support this option, as "jsub -m e true" shows.
The "problem", however, is that "aborted" in the SGE sense
does not cover OOM-killed jobs, so there is no way to send
mail only on OOMs but not on normal exits (cf. also
https://phabricator.wikimedia.org/T52053).  So one needs to
use "-m ae" to be informed about OOMs.
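
As a rough sketch of the above (the job script name
"./my_job.sh" and the helper function are hypothetical, and
the command is only printed, not submitted):

```shell
# Print (do not run) a jsub command line that asks for mail on
# both "abort" (a) and "end" (e) events. Since SGE's "aborted"
# does not cover OOM kills, "e" is what actually reports them.
mail_on_abort_and_end() {
    printf 'jsub -m ae %s\n' "$*"
}

# ./my_job.sh is a placeholder for your own job script.
mail_on_abort_and_end ./my_job.sh
```

The price is one mail for every job that ends, normal exits
included.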

Tim



