Here's my take:

Split the web servers and DB servers.
1 master DB; only write queries go to this DB.
As needed, add slave servers; these only handle read queries.
Add web servers as needed.

This is the easiest way to go about it, using MySQL's built-in replication
feature. It makes the most sense too, in my book.
The only thing needed to make Wikipedia work like this is a DB connection
library that looks at an SQL statement and routes it to where it's supposed
to go. I wrote a DB library for MySQL in PHP once that did all this; it's
pretty cool, if I may say so. If you are interested, I'll send you the code.
It's part of a much bigger project, but I figure any decent PHP programmer
should be able to grasp the concept of it. It might not be super efficient,
since I wrote it while I was still learning, but it works. If anyone is
interested, let me know.
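The routing idea is easy to sketch. Here's a minimal illustration in Python rather than PHP, just to make the concept stand on its own; the host names are placeholders, not real servers, and a real library would of course manage actual MySQL connections:

```python
import itertools

# Placeholder addresses for illustration only.
MASTER = "db-master:3306"
SLAVES = ["db-slave1:3306", "db-slave2:3306"]

# Statement verbs that modify data and so must go to the master.
WRITE_VERBS = {"insert", "update", "delete", "replace", "create",
               "alter", "drop", "truncate", "lock"}

# Round-robin over the read-only slaves.
_slave_cycle = itertools.cycle(SLAVES)

def route(sql: str) -> str:
    """Return the server this statement should be sent to:
    writes go to the master, reads rotate over the slaves."""
    verb = sql.lstrip().split(None, 1)[0].lower()
    if verb in WRITE_VERBS:
        return MASTER
    return next(_slave_cycle)
```

So route("SELECT ...") picks one of the slaves, while route("UPDATE ...") always returns the master; adding read capacity is then just a matter of appending another entry to the slave list.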
Lightning
----- Original Message -----
From: "Nick Hill" <nick(a)nickhill.co.uk>
To: <wikitech-l(a)wikipedia.org>
Sent: Monday, November 25, 2002 4:53 PM
Subject: [Wikitech-l] Long term plans for scalability
I believe Wikipedia is being held back, in terms of how many people can
use it and how it can grow, by architectural constraints.
The current architecture, with one machine taking the entire burden of all
searches, updates and web page delivery, inherently limits the rate at
which Wikipedia can grow.
In order for Wikipedia to grow, it needs an architecture which can easily
devolve work to other servers. A main database is still required to
enforce administrative policy and maintain database consistency.
Work to improve the speed of the database and reduce lag will, in the
long run, be of only very limited benefit, perhaps reducing the lag users
experience for a few days or weeks.
A method of easily implementing mirror servers with live, real-time
updates is required. Each mirror server should provide all the
functionality users expect from Wikipedia, except for handling form
submissions of updates, which should be forwarded to the master wiki
server.
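The mirror's dispatch rule is simple: serve reads locally, hand writes upstream. A rough sketch, with hypothetical host names and a generic edit-submission check standing in for whatever the real wiki software uses:

```python
# Hypothetical hosts, for illustration only.
MASTER_WIKI = "master.wikipedia.example"
LOCAL_MIRROR = "mirror.example.org"

def dispatch(method: str, path: str) -> str:
    """Decide which host handles a request: edit submissions
    (POSTs, or requests to an edit-submit path) are forwarded to
    the master; everything else is served from the local mirror."""
    if method == "POST" or "action=submit" in path:
        return MASTER_WIKI
    return LOCAL_MIRROR
```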
The main database server should be relieved of the burden of serving web
pages and concentrate on running administrative code, and on processing
and posting database updates.
The update system can be achieved by either:
1) The main server creates SQL files of incremental changes, which are
emailed to the mirror servers, signed with a key pair and sequentially
numbered to ensure they are automatically processed in order. This way
the server can run asynchronously with the mirrors, which is better for
the reliability of the server. The server will not need to wait for
connection responses from the mirrors, and updates will be cached in
the mail system in the event that a mirror server is unavailable. (The
main server then only needs to create one email per update. The mail
system infrastructure will take care of sending the data to each
mirror. In fact, a system such as the Pipermail setup used on this list
would solve the problem wonderfully. Mirror admins simply subscribe to
the list to get all updates sent to their machine, and can manually
download any updates they are missing from the list archive!)
Or
2) The master server opens a connection directly to the SQL daemon on
each remote machine. In this case the server will need to track which
updates each mirror has and has not received, and will need to wait for
time-outs on non-operational mirrors. (This system may also open up
exploits on the server via the SQL interface.)
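The ordering guarantee in option 1 is the interesting part: mail can arrive out of order, so a mirror has to buffer early arrivals and apply updates strictly by sequence number. A minimal sketch of that bookkeeping (signature checking and the actual SQL execution are left out; the applied list simply stands in for running the statements against MySQL):

```python
class UpdateApplier:
    """Apply sequentially numbered update batches in order,
    buffering any that arrive early (e.g. reordered mail)."""

    def __init__(self):
        self.next_seq = 1
        self.pending = {}   # seq number -> SQL text, held until its turn
        self.applied = []   # stands in for executing against the local DB

    def receive(self, seq: int, sql: str):
        self.pending[seq] = sql
        # Apply every consecutive update we now have, then stop
        # and wait for the next gap to be filled.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

If update 2 arrives before update 1, it just sits in the buffer; as soon as update 1 shows up, both are applied in order. A missing number also tells the admin exactly which update to fetch from the list archive.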
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)wikipedia.org
http://www.wikipedia.org/mailman/listinfo/wikitech-l