Heya,
A lot of us who write Python code use the multiprocessing module because it's an easy way to distribute a workload across many CPUs. But when you do, please don't allocate all cores to your jobs, because that basically makes the box unavailable to other folks (particularly when your jobs are long-running). You can use the multiprocessing.cpu_count() function to get the number of cores on the machine and subtract 1 or 2 so that there is some slack left for other processes.
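For illustration, here is a minimal sketch of that idea. The helper name polite_worker_count and the toy square function are made up for this example, and the default of 2 reserved cores is just one reasonable choice:

```python
import multiprocessing


def polite_worker_count(reserve=2):
    """Return a worker count that leaves `reserve` cores free (never below 1)."""
    total = multiprocessing.cpu_count()  # all cores on the machine
    return max(1, total - reserve)


def square(x):
    # Stand-in for the real per-item work.
    return x * x


if __name__ == "__main__":
    # Size the pool below the core count so other users keep some slack.
    with multiprocessing.Pool(processes=polite_worker_count()) as pool:
        print(pool.map(square, range(10)))
```

The max(1, ...) guard matters on small machines, where subtracting 2 from the core count could otherwise give zero or a negative number.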
thx!
D
Another nice way to make sure your process doesn't inconvenience users of the machine is to use the unix utility nice.
https://en.wikipedia.org/wiki/Nice_(Unix)
This utility lowers the scheduling priority of your process. That lets you make use of all *available* resources without getting in the way of others running non-nice'd processes.
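If you would rather do this from inside the script than on the command line, something like the sketch below should work on Unix. The function name be_nice and the increment of 10 are only illustrative choices:

```python
import os


def be_nice(increment=10):
    """Raise this process's niceness by `increment` and return the new value.

    Roughly what starting the script as `nice -n 10 python my_job.py` does
    (my_job.py is a placeholder name). Unix only.
    """
    return os.nice(increment)


if __name__ == "__main__":
    print("niceness is now", be_nice())
```

Child processes inherit the niceness of their parent, so calling this before creating a multiprocessing pool covers the workers as well.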
-Aaron
On Wed, Aug 28, 2013 at 4:00 PM, Diederik van Liere <dvanliere@wikimedia.org> wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
On Wed, Aug 28, 2013 at 10:04 PM, Aaron Halfaker aaron.halfaker@gmail.com wrote:
Another nice way to make sure your process doesn't inconvenience users of the machine is to use the unix utility nice.
I don't know much about the environment(s) in question, but the best approach is usually to make concurrency, niceness, etc. configurable, so that whoever decides where the job runs and what else runs alongside it (a sysadmin?) can tune those parameters too.
Or is this just a big shared system where everyone starts their own services themselves? Then maybe you want to take some hints from labs/toolserver: no long-running processes or batch jobs unless they go through the batch queuing system (e.g. Grid Engine).
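As a rough sketch of what "configurable" could look like in a Python job: the flag names --workers and --nice, their defaults, and the toy work function are all assumptions made for this example, not an existing convention on any of these machines.

```python
import argparse
import multiprocessing
import os


def parse_args(argv=None):
    """Expose concurrency and niceness as command-line options."""
    default_workers = max(1, multiprocessing.cpu_count() - 2)
    parser = argparse.ArgumentParser(description="Example batch job")
    parser.add_argument("--workers", type=int, default=default_workers,
                        help="number of worker processes")
    parser.add_argument("--nice", type=int, default=10,
                        help="niceness increment applied before starting work")
    args = parser.parse_args(argv)
    if args.workers < 1:
        parser.error("--workers must be at least 1")
    return args


def work(x):
    # Stand-in for the real per-item work.
    return x * x


def main(argv=None):
    args = parse_args(argv)
    os.nice(args.nice)  # workers forked below inherit this niceness
    with multiprocessing.Pool(processes=args.workers) as pool:
        return pool.map(work, range(10))


if __name__ == "__main__":
    print(main())
```

Whoever launches the job can then pick values that suit the box, for example `--workers 4 --nice 15`, without editing the script.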
-Jeremy
Jeremy,
I believe that Diederik was referring to a shared system that we do a lot of basic data crunching on. We analysts are a relatively small group, so structured job scheduling seems a little heavy-handed.
-Aaron
On Wed, Aug 28, 2013 at 5:13 PM, Jeremy Baron jeremy@tuxmachine.com wrote: