Hello,
Is there a way to analyze the Wikipedia logs to figure out which
processes take the most time? There is no immediate need, but I wanted
to shoot off an idea to consider. If we were able to capture the
processes that put a heavy load on the servers and push them onto a
distributed process pool, would it help Wikipedia or MediaWiki in
general? I'd imagine the trade-off would be between the processing time
saved and the speed of the network traffic involved.
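
To make the first question concrete, the rough sketch below is the kind
of log aggregation I have in mind for finding the heavy spots. The log
format (one line per call, with an operation name and a duration) is
invented purely for illustration; the real profiling output would need
its own parser.

#!/usr/bin/env python
"""Rough sketch: total up per-operation time from a profiling log.

The format is made up for illustration -- each line is assumed to look
like "<timestamp> <operation> <milliseconds>", e.g.
"2006-01-02T15:04:05 page_diff 183.2".
"""
import sys
from collections import defaultdict

def summarize(log_file):
    total_ms = defaultdict(float)   # operation -> total time spent
    calls = defaultdict(int)        # operation -> number of calls
    for line in log_file:
        parts = line.split()
        if len(parts) != 3:
            continue                # skip lines that don't fit the assumed format
        _, operation, ms = parts
        try:
            ms_val = float(ms)
        except ValueError:
            continue
        total_ms[operation] += ms_val
        calls[operation] += 1
    return total_ms, calls

if __name__ == "__main__":
    totals, calls = summarize(sys.stdin)
    # Print the operations that eat the most total time, heaviest first.
    for op in sorted(totals, key=totals.get, reverse=True)[:20]:
        print("%-40s %10.1f ms total %8d calls" % (op, totals[op], calls[op]))
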
Let's say we determined that the code that creates a diff between two
pages is a hog and could be put into the pool. We could use something
like BOINC, http://boinc.berkeley.edu/, to standardize the pool, and we
could add the diff process to the pool as the server load gets heavy.
BOINC is geared more toward research tasks, though, so the setup would
need to be adapted for MediaWiki.
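
Very roughly, the dispatch side of that idea might look like the sketch
below. Everything in it is hypothetical -- the load threshold, the toy
queue standing in for the pool, and difflib standing in for the real
diff code -- and in practice BOINC's own work-unit machinery would take
the place of the toy queue. The only point is that the expensive step
gets shipped out only when the local load crosses some threshold.

"""Rough sketch: run the diff locally under normal load, hand it to a
distributed pool of workers when the servers get busy.  The names and
numbers here are made up; this is just the shape of the idea."""
import difflib
import os
import queue

LOAD_THRESHOLD = 4.0          # made-up cutoff for "the server is busy"
pool_queue = queue.Queue()    # stand-in for the distributed pool's job queue

def local_diff(old_text, new_text):
    """Compute the diff on this server (difflib stands in for the real diff code)."""
    return "\n".join(difflib.unified_diff(old_text.splitlines(),
                                          new_text.splitlines(),
                                          lineterm=""))

def submit_to_pool(old_text, new_text):
    """Hand the job to the pool and wait for the result.  A worker is
    faked inline here so the sketch actually runs end to end."""
    pool_queue.put((old_text, new_text))
    old, new = pool_queue.get()   # a remote worker would pick this up instead
    return local_diff(old, new)

def get_diff(old_text, new_text):
    load = os.getloadavg()[0]     # one-minute load average (Unix only)
    if load > LOAD_THRESHOLD:
        return submit_to_pool(old_text, new_text)
    return local_diff(old_text, new_text)

if __name__ == "__main__":
    print(get_diff("foo\nbar\n", "foo\nbaz\n"))
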
I just used the diff example to keep this message short; mostly I'm
looking for your feedback on the general idea.
Thanks
Jonathan