RE: [Foundation-l] Re: Information flow - Wikimedia-l

23 Aug 2005

Hello,

...
  You were able to solve the problem by adding hardware. Whether
  that is a victory or a failure is a matter of opinion.
See, the community knows more about new hardware because
it pays for it. It does not pay developers and therefore does not
know what software optimizations were introduced. And we did
solve many problems by introducing software optimizations.
So far, the operation of the Wikimedia cluster has been a pure victory.

...
  Yes, blindly hacking away, disabling useful functions at random,
  instead of analyzing where the bottlenecks were.  It was very
  embarrassing to watch.  The growth numbers in 2002-2003 for
  susning.nu (which still runs on a single server) show what
  software optimization can do, should you have chosen that path.
What we have now is content produced by massive 
collaboration with a small export of it via our caches 
to the open world. 

Sir, we have profiling information. We know where our bottlenecks are.
We deal with those bottlenecks. I do know how to do software optimization
for websites, though. Remove as much magic as possible, use very simple
methods, ... We could serve the whole website as HTML files from a single
Pentium 4 box at gigabit speeds. We could drop accessibility and use
lots of AJAX for all content loading and interaction. We could use MediaWiki
as an application server instead of a script. There are still lots of
challenges ahead. There are ways to do things better.

If you have any specific ideas on how to 'optimize' all that, feel free.
Join the #mediawiki or #wikimedia-tech channels, discuss profiler
output with the developers, and brainstorm on how things can be done better.

I also have cases of 'growth numbers' in my portfolio, but none of
them reached the scale of Wikipedia, whether in load, in variety
of users, or in size of content. And there we had bigger full-time
development/operations teams, and bigger budgets as well.

Cheers,
Domas