Hello,
Is there a way to analyze the Wikipedia logs to figure out which processes take the most time? There is no immediate need, but I wanted to float an idea for consideration. If we could identify the processes that put a heavy load on the servers and push them onto a distributed process pool, would that help Wikipedia, or MediaWiki in general? I'd imagine there would be a trade-off between the processing time saved and the speed of the network traffic involved.

Let's say we determined that the code that creates a diff between two pages is a hog and could be moved into the pool. We could use something like BOINC, http://boinc.berkeley.edu/, to standardize the pool, and hand diff jobs off to it as the server load gets heavy. BOINC is aimed more at research tasks, so it would need to be adapted for MediaWiki; I'm only using it as an example to keep this message short and get your feedback.
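For the log-analysis part, here is a rough sketch of the kind of aggregation I have in mind, just to make the idea concrete. The log format and the example operation name are assumptions for illustration, not MediaWiki's actual profiling output:

#!/usr/bin/env python
# Sketch: sum up per-operation wall time from a profiling log.
# Assumes one "operation elapsed_seconds" pair per line, e.g.
#   DifferenceEngine::getDiff 0.431
# (that line format is made up for illustration)
import sys
from collections import defaultdict

totals = defaultdict(float)
counts = defaultdict(int)

for line in sys.stdin:
    parts = line.split()
    if len(parts) != 2:
        continue  # skip lines that don't match the assumed format
    op, elapsed = parts
    try:
        totals[op] += float(elapsed)
    except ValueError:
        continue
    counts[op] += 1

# Report the ten operations with the largest cumulative time.
for op in sorted(totals, key=totals.get, reverse=True)[:10]:
    print("%-40s %8.2fs over %d calls" % (op, totals[op], counts[op]))

Something like "python toptime.py < profiling.log" would then give a first cut at which operations are worth offloading.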
Thanks
Jonathan