[Foundation-l] Wikimedia Projects Growth Animated
Domas Mituzas
midom.lists at gmail.com
Wed Mar 26 18:57:40 UTC 2008
Hi!
This may sound like heresy, but for jobs that are short-lived yet
need lots of CPU capacity, we could try using Amazon's EC2 or any
other grid computing service (maybe some university wants to
donate cluster time?).
That would be much cheaper than allocating high-performance,
big-bucks hardware to projects like this.
Really, we have a capable cluster with spare CPU capacity for
distributed tasks, but anything that needs lots of memory in a single
location simply doesn't scale.
Most of our tasks are scaled out, with lots of smaller machines
doing lots of big work; this wikistats job is the only one that
cannot be distributed this way.
Eventually we may run Hadoop, Gearman or a similar framework for
statistics job distribution, but first of all the actual tasks have
to be broken down into smaller segments, for map/reduce operation
if needed.
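
To illustrate what "smaller segments" could look like, here is a
minimal sketch of a map/reduce-style count. It is not the actual
wikistats code: the (project, timestamp) record format, the function
names and the segment size are all assumptions made up for this
example. Each segment is counted on its own, so no single step needs
the whole dataset in memory, and the partial counts are merged at
the end.

```python
from collections import Counter
from functools import reduce
from itertools import islice


def segments(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def map_segment(chunk):
    """Map step: count edits per (project, month) within one segment."""
    counts = Counter()
    for project, timestamp in chunk:
        # Hypothetical timestamp format: "2008-03-26T18:57:40Z" -> "2008-03"
        counts[(project, timestamp[:7])] += 1
    return counts


def reduce_counts(total, partial):
    """Reduce step: fold one partial count into the running total."""
    total.update(partial)  # Counter.update adds counts together
    return total


def edits_per_month(records, segment_size=2):
    # The built-in map runs the segments one after another; a worker
    # pool or a job framework could hand the same segments to many nodes.
    partials = map(map_segment, segments(records, segment_size))
    return reduce(reduce_counts, partials, Counter())


# Made-up sample data, for illustration only.
sample = [
    ("enwiki", "2008-03-01T10:00:00Z"),
    ("enwiki", "2008-03-15T12:30:00Z"),
    ("dewiki", "2008-03-20T08:00:00Z"),
    ("enwiki", "2008-02-28T23:59:00Z"),
    ("dewiki", "2008-03-21T09:00:00Z"),
]
result = edits_per_month(sample)
```

Because the map step only ever sees one segment, its memory use is
bounded by the segment size rather than by the size of the dump,
which is the property that would let such a job run on many small
nodes.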
I don't see many problems (except setting the whole grid up) with
allocating job execution resources during off-peak hours, on 10, 20
or 100 nodes, as long as a job doesn't have exceptional resource
needs on a single node. It would be very good practice for many
other future jobs too.
BR,
--
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]