On Tue, 10 Sep 2002 14:09:33 -0700 Ray Saintonge saintonge@telus.net wrote:
<information suggesting a non-linear growth curve for Wikipedia>
I have seen messages about changing the database engine from MySQL to PostgreSQL to fix table-locking problems on a busy system.
I am concerned that this _type_ of engineering work may not be what is really needed.
My contention is that Wikipedia's load can grow at an exponential rate but may be constrained by resource availability. There are many factors which cause self-multiplication.
Decisions which need to be made:
1) Do we want Wikipedia to be _able_ to grow at an exponential rate?

If yes:
   a) We need to consider a technical system that distributes load so that no single system has to handle all of it.
   b) We need to consider whether the current social system of regulation can scale to meet demand, and monitor it.
   c) We need to keep the quality of Wikipedia under conscious review during such growth, and consider adding constraints if that growth rate starts causing undesirable effects.

If no:
   a) Consider how availability of the system will be limited in order to prevent exponential growth, and at what rate, if any, availability is extended.
   b) Consider which parts of the system are best rationed to limit the growth rate, i.e., should searches, page views or edits be limited?
From my experience using the system over the last few days, I perceive that there is currently a technical constraint limiting the rate of growth. This may be desirable or it may be undesirable. Do we know which it is? Has an explicit decision been made?
A scalable solution is to give nearly all responsibility for all wiki functionality to mirror servers. Updates are posted directly to the main Wiki server which in turn posts the database updates to registered first tier mirrors which, in turn, can post database updates to second tier mirrors registered with them and so on. This way, all mirrors can be kept in sync in near real time with a minimum of CPU, memory and network load. The main server then need do nothing other than maintain database consistency, accept and post updates.
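
To make the fan-out concrete, here is a minimal sketch of the tiered scheme in Python. It is only an illustration under my own assumptions, not a description of any existing Wikipedia code: the Node class, its register and apply_update methods, and the tier sizes (2 first-tier mirrors, 3 second-tier mirrors each) are all hypothetical, and a real system would push updates over the network rather than through in-process calls.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A wiki server: the main server or a mirror at any tier (hypothetical model)."""
    name: str
    pages: dict = field(default_factory=dict)    # page title -> current text
    mirrors: list = field(default_factory=list)  # mirrors registered with this node
    sent: int = 0                                # updates this node forwarded itself

    def register(self, mirror: "Node") -> "Node":
        """Register a downstream mirror and return it for convenience."""
        self.mirrors.append(mirror)
        return mirror

    def apply_update(self, title: str, text: str) -> None:
        """Store the update locally, then post it to every registered mirror."""
        self.pages[title] = text
        for mirror in self.mirrors:
            self.sent += 1
            mirror.apply_update(title, text)


# Assumed topology: 1 main server, 2 first-tier mirrors, 3 second-tier mirrors each.
main = Node("main")
tier1 = [main.register(Node(f"t1-{i}")) for i in range(2)]
tier2 = [t.register(Node(f"{t.name}-t2-{j}")) for t in tier1 for j in range(3)]

# An edit is posted to the main server only; the tiers relay it onward.
main.apply_update("Main_Page", "Welcome to Wikipedia")

print(main.sent)               # posts made by the main server itself
print([t.sent for t in tier1]) # posts made by each first-tier mirror
```

In this toy topology all eight mirrors end up with the update, but the main server itself only posts it twice; the rest of the work is done by the first-tier mirrors. That is the property I am after: the main server's cost per edit depends on the number of first-tier mirrors, not on the total number of mirrors.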