Over the past 12 hours I've also gotten a fair number of error reports by mail and on IRC about some of my projects.
A few samples:
* pywikipedia script:
Your "cron" job on willow python $HOME/SVN/pywikipedia/fileprotectionsync_live.py > $HOME/bots/py_fileprotectionsync_live.log 2>&1
produced the following output:
ld.so.1: sh: fatal: mmap anon failed: Resource temporarily unavailable
ld.so.1: sh: fatal: /usr/lib/libc.so.1: mmap failed: Resource temporarily unavailable
ld.so.1: sh: fatal: libc.so.1: open failed: No such file or directory
* custom php script for [[m:CVN]]
Your "cron" job on willow php $HOME/CVN-backend/cronjob_cvnapi.php > $HOME/CVN-backend/cronjob_cvnapi.log 2>&1
produced the following output:
Killed
* minutely start attempt for a long-running shell script in SGE
Your "cron" job on willow cronsub -l -s dbbot_wm $HOME/bots/dbbot-wm-start.sh
produced the following output:
/opt/local/bin/cronsub[52]: 28142 Killed
* minutely start attempt for a long-running shell script in SGE
Your "cron" job on willow cronsub -l -s dbbot_wm $HOME/bots/dbbot-wm-start.sh
produced the following output:
critical error: malloc() failure
* wmfCodeSearch exec
Your "cron" job on willow php $HOME/wss_backend/runJobs.php > $HOME/logs/wmfCodeSearch/runJobs.log 2>&1
produced the following output:
Segmentation Fault - core dumped
etc.
Due to the diversity of the errors, I can't find any link between the various failures.
I hope these help in determining/fixing the issue.
-- Krinkle
On Apr 22, 2012, at 6:51 AM, DeltaQuad wrote:
OK, so I've checked status.toolserver.org and found nothing going on, but some stuff has been breaking at random points today.
I've had a Cron job return: Subject: Cron deltaquad@willow cronsub IPBEBot $HOME/IPBE/IPBE.py
/opt/local/bin/cronsub[44]: 9793 Killed
I've had another return: Subject: Cron deltaquad@willow cronsub UAABot $HOME/UAA/UAA.py
error: not enough memory to allocate 2404 bytes in init_packbuffer
Unable to run job: Error reading answer list from qmaster.
Exiting.
And then on a MMP (yes this is a customized message):
The x script has failed. The error message received was: <b>A database error occured when attempting to process your request: </b><br />Failed to connect to database server ! Please check the database to resolve this issue and ensure that private data is removed on schedule.
Is it what we have running, or is this a Toolserver issue in general? And should I file a bug?
(SysAdmins - Especially DaB. please don't take this as me being critical, I just wanna help if I can identify any problems and file a JIRA if needed ;) )
-- DeltaQuad English Wikipedia Administrator
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette