I believe Wikipedia's architecture is holding back both the number of people who can use it and its ability to grow.
The current architecture of one machine taking the entire burden of all searches, updates and web page delivery inherently limits the rate at which Wikipedia can grow.
In order for Wikipedia to grow, it needs an architecture which can easily devolve work to other servers. A main database is still required to enforce administrative policy and maintain database consistency.
Work to improve the speed of the database and reduce lag will, in the long run, be of very limited benefit: at best it may reduce the lag users experience for a few days or weeks.
A method of easily implementing mirror servers with live, real-time updates is required. Each mirror server should cater for all the functionality users expect from Wikipedia except for taking care of form submissions of updates, which should be forwarded to the master wiki server.
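As a rough illustration of that split, here is a minimal sketch of how a mirror might decide what to serve itself and what to pass on. It assumes that edit submissions arrive as POST requests and that everything else is a read; the `Request` type, the `route` function and the master URL are made up for the example and are not part of any existing Wikipedia code.

```python
from dataclasses import dataclass

# Hypothetical address of the master wiki server (an assumption for this sketch).
MASTER_URL = "https://master.example.org"


@dataclass
class Request:
    method: str  # HTTP method, e.g. "GET" or "POST"
    path: str    # requested path, e.g. "/wiki/Foo"


def route(request: Request) -> str:
    """Return "local" if the mirror should serve the request itself,
    or the master URL the request should be forwarded to.

    Assumption: form submissions of updates are POST requests;
    searches and page views are anything else.
    """
    if request.method.upper() == "POST":
        return MASTER_URL + request.path
    return "local"
```

With this rule the mirror carries all of the read load, and the master only ever sees the update traffic.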
The main database server should be released from the burden of serving web pages and concentrate on running administrative code, processing and posting database updates.
The update system can be achieved by either: 1) the main server creating SQL files of incremental changes to be emailed to mirror servers, signed with a key pair and sequentially numbered to ensure they are automatically processed in order. This way, the server can run asynchronously with the mirrors, which is better for the reliability of the server. The server will not need to wait for connection responses from the mirrors, and updates will be cached in the mail system in the event that a mirror server is unavailable. (The main server will then only need to create one email per update. The mail system infrastructure will take care of sending the data to each mirror. In fact, a system such as pipermail used on this list would solve the problem wonderfully. Mirror admins simply subscribe to the list to get all updates sent to their machine and can manually download any updates they are missing from the list!)
Or 2) by the master server opening a connection directly to the SQL daemon on each remote machine. In this case the server will need to track which updates each mirror has and has not received, and will need to wait for time-outs on non-operational mirrors. (This system may also open exploits on the server via the SQL interface.)
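To make option 1 more concrete, here is a minimal sketch of the mirror side: it checks each update, holds back any that arrive early, and applies them strictly in sequence order. Everything in it is an assumption for illustration. The `Mirror` class and `sign` function are invented names, SQLite stands in for the real database, and an HMAC with a shared key stands in for the key-pair signature proposed above (a real deployment would verify a public-key signature instead).

```python
import hashlib
import hmac
import sqlite3

# Stand-in only: a shared secret instead of the proposed key pair.
SHARED_KEY = b"demo-key"


def sign(seq: int, sql: str, key: bytes = SHARED_KEY) -> str:
    """Sign the sequence number together with the SQL text."""
    msg = f"{seq}\n{sql}".encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


class Mirror:
    """Applies sequentially numbered SQL updates strictly in order."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self.db = db
        self.next_seq = 1   # number of the next update we may apply
        self.pending = {}   # early arrivals: seq -> sql

    def receive(self, seq: int, sql: str, signature: str) -> bool:
        """Handle one update; return False if the signature is bad."""
        if not hmac.compare_digest(signature, sign(seq, sql)):
            return False
        if seq < self.next_seq:
            return True     # duplicate delivery, already applied
        self.pending[seq] = sql
        # Apply every update that is now contiguous with what we have.
        while self.next_seq in self.pending:
            self.db.executescript(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return True

    def missing(self) -> list:
        """Sequence numbers that must be fetched from the list archive."""
        if not self.pending:
            return []
        return [n for n in range(self.next_seq, max(self.pending))
                if n not in self.pending]
```

Because the mail system gives no ordering guarantee, the mirror never applies update N+1 before update N. The `missing()` list is what a mirror admin (or a script) would download from the archive to fill the gaps.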
wikitech-l@lists.wikimedia.org