hi,
we are testing a new way to run tools on the Toolserver, that will allow tools to easily take advantage of spare resources, including new servers we add in the future, without any change from users.
for more information, and documentation on how to use it, please see https://wiki.toolserver.org/view/Batch_job_scheduling
- river.
River Tarnell wrote:
> hi,
> we are testing a new way to run tools on the Toolserver, that will allow tools to easily take advantage of spare resources, including new servers we add in the future, without any change from users.
> for more information, and documentation on how to use it, please see https://wiki.toolserver.org/view/Batch_job_scheduling
> - river.
I think that's a good move, but will it only run jobs on idle servers? What about people wanting to run a job *now*? Or jobs which should run continuously?
Platonides:
> will it only run jobs on idle servers?
yes; or more specifically, it will run jobs on servers where the load average is less than 1.75*NCPUS (this is configurable).
i'm not sure there's much advantage to having an immediate-execution queue; would the purpose of this be to run a job on the least loaded server, even if it's already overloaded?
one thing i will probably implement is a queue based on database load, so you could submit a job to the "s1" queue, and it will be executed when the s1 database is sufficiently idle.
- river.
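
The load rule described above (a job starts only where the load average is below 1.75*NCPUS) can be sketched as a small shell check. This is only an illustration of the arithmetic, not the scheduler's actual code: the function name below_load_threshold is hypothetical, and the load average and CPU count are passed in as arguments rather than read from the system.

```shell
#!/bin/sh
# Hypothetical helper (not part of the scheduler): succeeds when
# load_average < factor * ncpus.
# Arguments: load average, number of CPUs, optional factor (default 1.75).
below_load_threshold() {
    awk -v load="$1" -v ncpus="$2" -v factor="${3:-1.75}" \
        'BEGIN { exit !(load + 0 < factor * ncpus) }'
}

# Example: load average 3.2 on a 4-CPU host; 1.75 * 4 = 7.0, so the job may start.
if below_load_threshold 3.2 4; then
    echo "below threshold: job may start"
else
    echo "above threshold: job waits in the queue"
fi
```

The optional third argument mirrors the remark that the factor is configurable; how it is actually configured is not covered here.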
River Tarnell wrote:
> Platonides:
>> will it only run jobs on idle servers?
> yes; or more specifically, it will run jobs on servers where the load average is less than 1.75*NCPUS (this is configurable).
> i'm not sure there's much advantage to having an immediate-execution queue; would the purpose of this be to run a job on the least loaded server, even if it's already overloaded?
It doesn't look nice. But if the task is not executed "right now", how many users will wait in the queue, and how many will ssh in to run it directly?
> one thing i will probably implement is a queue based on database load, so you could submit a job to the "s1" queue, and it will be executed when the s1 database is sufficiently idle.
> - river.
Looks good.
IMHO another thing worth documenting would be how to tell the scheduler "I'm an unimportant script; suspend me if there are scripts doing real work waiting" (some special command in the script text, I think).
it's now possible to schedule jobs that run SQL queries, as well as those that use CPU resources. this is described on the wiki at
https://wiki.toolserver.org/view/Batch_job_scheduling#Scheduling_SQL_queries
- river.
2009/9/16 River Tarnell <river@loreley.flyingparchment.org.uk>:
> it's now possible to schedule jobs that run SQL queries, as well as those that use CPU resources. this is described on the wiki at
> https://wiki.toolserver.org/view/Batch_job_scheduling#Scheduling_SQL_queries
I've tried to schedule a SQL query, but it does not work:
/var/opt/sge/default/spool/willow/job_scripts/94: mysql: not found
Mauro
Mauro Girotto:
> I've tried to schedule a SQL query, but it does not work:
> /var/opt/sge/default/spool/willow/job_scripts/94: mysql: not found
this should be fixed now.
- river.
2009/9/17 River Tarnell <river@loreley.flyingparchment.org.uk>:
> this should be fixed now.
Perfect, it works fine now.
Thanks River!
River Tarnell wrote:
> it's now possible to schedule jobs that run SQL queries, as well as those that use CPU resources. this is described on the wiki at
> https://wiki.toolserver.org/view/Batch_job_scheduling#Scheduling_SQL_queries
Is there any way to schedule jobs when you don't know in advance which SQL cluster they use? I normally just give the database name to my programs on the command line and leave it to the program to look up which cluster to use.
How can I schedule such jobs?
Another matter is that I use binaries which are compiled for a specific architecture. Would it be possible in a script to test which architecture it is running on, and then select the binary to run accordingly? Any code examples would be great.
/byrial
Byrial Jensen wrote:
> River Tarnell wrote:
>> it's now possible to schedule jobs that run SQL queries, as well as those that use CPU resources. this is described on the wiki at
>> https://wiki.toolserver.org/view/Batch_job_scheduling#Scheduling_SQL_queries
> Is there any way to schedule jobs when you don't know in advance which SQL cluster they use? I normally just give the database name to my programs on the command line and leave it to the program to look up which cluster to use.
> How can I schedule such jobs?
I'd also be interested in that. All my tools use the DNS aliases for connecting to the database servers, a method which I've found to be both convenient and reliable. Would the batch job scheduling system perhaps allow for similar aliases to be defined there as well, or something?
(Also, the one tool for which I might currently use this facility actually only uses the Commons database, which is replicated to all the servers. So, technically, it could use whichever database server happened to be least loaded at the moment, if such information could be provided to it. I'd be glad for any suggestions on how to do that.)
> Another matter is that I use binaries which are compiled for a specific architecture. Would it be possible in a script to test which architecture it is running on, and then select the binary to run accordingly? Any code examples would be great.
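In a shell script, you can check the output of the uname command (uname -s prints the operating system name, uname -m the hardware type). In Perl, the $^O variable holds the operating system name and can be used for the same purpose. I suppose you could also just try running both binaries and see which one works...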
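
As a rough sketch of the uname approach, something like the following could work. The function name platform_dir, the directory layout (bin/solaris-sparc and so on) and the tool name mytool are all made up for illustration; the set of platforms listed is an assumption and would need adjusting to whatever hosts the job can actually land on.

```shell
#!/bin/sh
# Map an OS name and hardware name (as printed by `uname -s` and `uname -m`)
# to a per-platform binary directory. Directory names are hypothetical.
platform_dir() {
    case "$1-$2" in
        SunOS-sun4*)  echo "bin/solaris-sparc" ;;
        SunOS-i86pc)  echo "bin/solaris-x86" ;;
        Linux-x86_64) echo "bin/linux-x86_64" ;;
        *)            echo "unsupported platform: $1 $2" >&2; return 1 ;;
    esac
}

# Usage: pick the directory for the current host, then run the tool from it.
# "mytool" is a placeholder; replace the echo with exec once the paths are real.
if dir=$(platform_dir "$(uname -s)" "$(uname -m)"); then
    echo "would run: $HOME/$dir/mytool"
fi
```

Taking the two uname values as arguments, rather than calling uname inside the function, makes the mapping easy to check by hand for a platform you are not currently logged in to.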
toolserver-l@lists.wikimedia.org