Jonathan wrote:
> Is there a way to analyze the Wikipedia logs to figure out which
> processes take the most time? There is no immediate need, but I
> wanted to shoot off an idea to consider. If we were able to
> capture the processes that put a heavy load on the servers and
> push them onto a distributed process pool, would it help
> Wikipedia or MediaWiki in general? I'd imagine there

This sounds like a nice theory, but first you need the numbers.
There are just so many more normal page views (100 times? 1,000
times?) than diffs, history views, or edits, and the normal page
views are already taken care of by caching proxies (Squid).
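
For a first cut at those numbers, one could tally request types
straight from the web servers' access logs. A rough sketch, in
Python; the log path and the URL buckets are just my guesses at a
combined-format log with MediaWiki's usual URL parameters, not how
the servers are actually configured:

    import re
    from collections import Counter

    # Pull the request URL out of a combined-format log line.
    REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')

    def classify(url):
        """Bucket a request URL into a coarse request type."""
        if "action=edit" in url or "action=submit" in url:
            return "edit"
        if "action=history" in url:
            return "history"
        if "diff=" in url:
            return "diff"
        return "page view"

    counts = Counter()
    with open("access.log") as log:  # hypothetical log file
        for line in log:
            m = REQUEST_RE.search(line)
            if m:
                counts[classify(m.group(1))] += 1

    for kind, n in counts.most_common():
        print(f"{kind:10} {n}")

Even a crude breakdown like this would show whether edits, diffs,
and history views are anywhere near frequent enough to be worth
distributing.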
I don't know what the numbers are today, or what the hit/miss
ratio of the Squid cache is. It would be interesting to know.
Are these statistics documented anywhere?
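
If one can get at a Squid access.log, the ratio itself is easy to
estimate: in Squid's native log format the fourth field is the
result code (TCP_HIT/200, TCP_MISS/200, and so on). A minimal
sketch, assuming that format and a made-up log path:

    # Estimate the Squid hit/miss ratio from its native access.log.
    hits = misses = 0
    with open("/var/log/squid/access.log") as log:  # path is an assumption
        for line in log:
            fields = line.split()
            if len(fields) < 4:
                continue
            code = fields[3].split("/")[0]
            if "HIT" in code:      # TCP_HIT, TCP_MEM_HIT, TCP_IMS_HIT, ...
                hits += 1
            elif "MISS" in code:   # TCP_MISS, TCP_REFRESH_MISS, ...
                misses += 1

    total = hits + misses
    if total:
        print(f"hit ratio: {hits / total:.1%} "
              f"({hits} hits, {misses} misses)")

Other result codes (TCP_DENIED and the like) are simply ignored
here, which is fine for a rough ratio.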
Page requests per day haven't been documented since October 2004:
http://stats.wikimedia.org/EN/TablesWikipediaEN.htm
--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik - http://aronsson.se