Hi!
This may sound like heresy, but for some jobs that are short in time span yet need lots of CPU capacity, we could try using Amazon's EC2 or another grid computing service (maybe some university wants to donate cluster time?). That would be much cheaper than allocating high-performance, high-bucks hardware to projects like this.
Really, we have a capable cluster with spare CPU capacity for distributed tasks, but anything that needs lots of memory in a single location simply doesn't scale. Most of our tasks are scaled out, where lots of smaller machines together do lots of big work, so this wikistats job is the only one that cannot be distributed this way.
Eventually we may run Hadoop, Gearman, or a similar framework for distributing statistics jobs, but first of all the actual tasks have to be broken down into smaller segments suitable for a map/reduce operation, if needed. I don't see many problems (apart from setting the whole grid up) with allocating job execution resources during off-peak hours on 10, 20, or 100 nodes, as long as the job doesn't have exceptional resource needs on any single node. It would be very good practice for many other future jobs too.
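To make the map/reduce angle concrete, here is a minimal sketch (not actual wikistats code; the tab-separated input layout, script name and field order are made up for illustration) of how an edits-per-wiki-per-month count could be split into a map step and a reduce step, Hadoop Streaming style:

#!/usr/bin/env python
# Hypothetical sketch: a wikistats-like count (edits per wiki per month)
# split into a map phase and a reduce phase, Hadoop Streaming style.
# Input format (one edit per line: "wiki<TAB>timestamp<TAB>...") is an
# assumption for illustration, not the real wikistats input.
import sys


def map_phase(lines):
    # Emit "wiki.YYYY-MM<TAB>1" for every edit record read from stdin.
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue                      # skip malformed records
        wiki, timestamp = fields[0], fields[1]
        month = timestamp[:7]             # assumes ISO-like "YYYY-MM-DD..."
        print("%s.%s\t1" % (wiki, month))


def reduce_phase(lines):
    # Sum the 1s per key; input is assumed sorted by key, as the
    # framework's shuffle/sort step delivers it.
    current_key, count = None, 0
    for line in lines:
        key, _, value = line.rstrip("\n").partition("\t")
        if key != current_key:
            if current_key is not None:
                print("%s\t%d" % (current_key, count))
            current_key, count = key, 0
        count += int(value or 0)
    if current_key is not None:
        print("%s\t%d" % (current_key, count))


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "reduce":
        reduce_phase(sys.stdin)
    else:
        map_phase(sys.stdin)

Locally the same split can be tested with something like: cat edits.tsv | python wikistats_mr.py map | sort | python wikistats_mr.py reduce. Under Hadoop Streaming the script would be passed as both the mapper and the reducer, with the framework handling the sort and the distribution across nodes.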
BR,
On Mar 26, 2008, at 6:57 PM, Domas Mituzas wrote:
> Hi!
> This may sound like heresy, but for some jobs that are short in time span yet need lots of CPU capacity, we could try using Amazon's EC2 or another grid computing service (maybe some university wants to donate cluster time?). That would be much cheaper than allocating high-performance, high-bucks hardware to projects like this.
If you end up using EC2 machines continuously, 24/7, then in my experience it is more cost-effective to run your own hardware. If all you need is temporary large-scale parallel execution or overflow capacity, then it is great.
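(As a rough illustration with made-up numbers rather than actual pricing: an instance billed at, say, $0.80/hour running 24/7 works out to 0.80 x 24 x 365 ≈ $7,000 per year, which quickly overtakes the up-front cost of a comparable owned box, whereas an occasional burst of 10-100 nodes for a few hours is exactly where on-demand capacity pays off.)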
Cheers,
Artur