Peter Körner <osm-lists(a)mazdermind.de> wrote:
For a few days now I've been getting weird errors when
submitting tasks.
My cron job calls
"/home/mazder/public_html/replicate-sequences/update-submit.sh"
which contains the following command:
qcronsub -l h_rt=0:05:00 -l virtual_free=100M -l arch=* \
  -l sql-user-m=1 -N mazder-replicate-sequences -m as \
  -o '/home/mazder/public_html/replicate-sequences/sge' \
  /home/mazder/public_html/replicate-sequences/update-run.sh
Most of these calls produce the error below, which does not
seem to be an error in my code, as I use neither XML nor Python.
Do you have any idea what's going wrong?
[...]
An educated guess: The Python errors come from the script
/sge/GE/bin/sol-amd64/qjobtest that is called as part of
qcronsub to test whether a job with that name is already
running. qjobtest parses the output of "qstat -xml ..."
which in normal operation returns a valid XML document. My
assumption is that when SGE is down, qstat returns the error
messages ("error: commlib error: can't connect to service
(Connection refused)", etc.) as plain text, which can't be
parsed as XML, which in turn causes qjobtest to barf.
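
To illustrate the failure mode, here is a minimal sketch in Python. It is not the actual qjobtest source: the function name job_names_from_qstat_xml, the sample XML and the choice of the JB_name element are my assumptions for illustration. It shows how a parser that expects XML chokes on a plain-text error message, and how catching the parse error would turn the traceback into a clean "don't know" result.

```python
import xml.etree.ElementTree as ET


def job_names_from_qstat_xml(text):
    """Return the set of job names found in `text`.

    Returns None if `text` is not well-formed XML, e.g. because
    qstat printed a plain-text error message instead of a document.
    The element name JB_name is an assumption about the XML layout.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Plain-text error output ends up here instead of raising.
        return None
    return {el.text for el in root.iter("JB_name") if el.text}


# Hypothetical, heavily shortened example of a valid XML answer.
sample_ok = (
    "<job_info><queue_info>"
    "<job_list state='running'>"
    "<JB_name>mazder-replicate-sequences</JB_name>"
    "</job_list>"
    "</queue_info></job_info>"
)

# The kind of plain text returned while the service is unreachable.
sample_down = "error: commlib error: can't connect to service (Connection refused)"

print(job_names_from_qstat_xml(sample_ok))
print(job_names_from_qstat_xml(sample_down))
```

Without the try/except, the second call would raise xml.etree.ElementTree.ParseError, which is the sort of Python traceback that ends up in the job mail.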
In short: This is another artefact of SGE being down at that
moment. You can't do anything about it, so just ignore it.
Tim