Hello all,
yesterday Merlissimo and I successfully tested the installation of the new SGE version for the toolserver. The last step is now to install the new version on the live system. For that, the SGE service needs to be stopped completely on the cluster, the old version has to be removed and the new one has to be installed. We plan to do this on
Thursday 5. July between 17:30 and 22:30 UTC.
During this time SGE will not be available. Stopped jobs will not be restarted (or migrated) after the update.
After the update is done, we will start to use the 2 Linux boxes for tools too (I will send details then).
Sincerely, DaB.
Hello all,
On Friday 06 July 2012 02:40:11 DaB. wrote:
> We plan to do this on
> Thursday 5. July between 17:30 and 22:30 UTC.
we had to extend the timeframe, but the main system is now working again (with the new version!). More details will follow tomorrow, after we have slept. One important thing: if you run SGE tasks from the command line, you have to log out and log in again once (on each server) to get the new environment variables.
Sincerely, DaB.
Hello,
On Sunday 15 July 2012 14:15:11 DaB. wrote:
> we had to extend the timeframe, but now the main-system is working again (with the new version!). More details tomorrow after our slumber.
there was never a detail email, for which I'm sorry. So here are some details now:
- SGE moved from /sge62 to /sge(/GE). If you have /sge62 in your PATH, in your (login) scripts or anywhere else, you have to change that (/sge62 will vanish at some point in the near future).
- You can now use SGE under Linux too.
- There is now "-l arch=lx", which runs your task on a Linux host.
- There is also "-l arch='*'", which runs your task on either Linux or Solaris.
- We created a shadow master that should help if the HA nodes are unavailable for some reason.
- We will soon send mails if your task has used more resources than it declared.
- Under Linux, SGE jobs run in a cgroup (one per job).
- Hawthorn and Clematis are no longer (available as) submit hosts.
- Munin graphs for SGE are working again and can currently be found in the turnera section.
- SGE under Linux is now handled by Puppet.
- qcronsub now has colorful help output (not sure whether that is new).
- The wiki page for SGE [1] was updated.
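As a rough aid for the /sge62 change mentioned above, the following sketch lists the entries of a PATH-style string that still point into the old tree. The function name old_sge_entries is made up for this example, and the login script names at the end are only guesses at what an account might use; adjust them to your own setup.

```shell
#!/bin/sh
# Sketch only: print the entries of a ':'-separated PATH string that are
# /sge62 itself or lie below it. The function name is hypothetical.
old_sge_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | while read -r entry; do
        case $entry in
            /sge62|/sge62/*) printf '%s\n' "$entry" ;;
        esac
    done
}

# Check the PATH of the current shell.
found=$(old_sge_entries "$PATH")
if [ -n "$found" ]; then
    echo "PATH still contains old SGE entries:"
    echo "$found"
else
    echo "PATH has no /sge62 entries"
fi

# Login scripts can be searched in a similar way (file names are examples).
grep -n '/sge62' "$HOME/.profile" "$HOME/.bashrc" 2>/dev/null || true
```

The sketch only reports leftovers and does not rewrite anything, since the replacement directory depends on which of the new locations your scripts should use.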
If there are any other questions, please use the mailing list. If you find a problem, please open a JIRA ticket. Thanks for your patience.
Sincerely, DaB.
[1] https://wiki.toolserver.org/view/Job_scheduling
Hello again,
On Sunday 15 July 2012, 14:39:18, DaB. wrote:
> - Hawthorn and Clematis are no longer (available as) submit hosts.
sorry, that was wrong: Clematis and Hawthorn are no longer EXECUTION hosts, but of course they are still submit hosts.
Sincerely, DaB.
toolserver-announce@lists.wikimedia.org