It's amazingly simple what needs to be done:
I always fear and like simplicity at the same time :)
either
software wise:
make Wikipedia less database-reliant. Most of Wikipedia's CPU
time is being wasted on relational operations for ultimately simple
queries. E.g., when you go to a page, it hits the database; this is
clearly unacceptable for a site of this scale. The simple solution is to
dump current versions of articles to files, for orders of magnitude
of performance increase. You could even have pre-prepared edit pages
as well. And I'm even told that the database is still being hit
for "recent-changes", whereas it should be dumped to a file every 1-5
minutes.
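To make the "dump it to a file every 1-5 minutes" idea concrete, here is a minimal sketch in Python. The function name, the `render` callback and the 300-second default are assumptions for illustration only, not anything Wikipedia actually runs: the snapshot file is served as-is, and the database is only asked again once the file is older than the chosen interval.

```python
import os
import time


def read_snapshot(path, render, max_age_seconds=300, now=None):
    """Return the snapshot stored at `path`.

    `render` is a hypothetical callback that queries the database and
    returns the page as a string. It is only called when the snapshot
    is missing or older than `max_age_seconds`.
    """
    now = time.time() if now is None else now
    try:
        fresh = now - os.path.getmtime(path) < max_age_seconds
    except OSError:
        fresh = False  # no snapshot on disk yet

    if not fresh:
        # Write to a temporary file first, then rename, so readers
        # never see a half-written snapshot.
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(render())
        os.replace(tmp, path)

    with open(path, encoding="utf-8") as f:
        return f.read()
```

With a 5-minute interval the database sees at most one "recent-changes" query per 5 minutes, however many visitors load the page.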
Caching is the most widely used strategy for dynamic websites nowadays.
That is to say, the commit mechanism could both store the article in the
DB and write a pure HTML version to a cache that everybody would read.
Then, if disk space is a constraint, you can choose to cache only the
pages with more than 1000 hits in the last week, or something like that.
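A rough sketch of that commit-plus-cache idea, in Python. Everything here is hypothetical and only for illustration: the `ArticleStore` class, the dict standing in for the real database, and the hit counter (which would have to be reset weekly to match the "last week" rule). A commit updates the DB and refreshes the static HTML copy of hot pages; a view reads the static file when there is one and only falls back to the DB otherwise.

```python
import html
import os
from collections import Counter


class ArticleStore:
    """Toy write-through cache: DB on commit, static HTML on read."""

    def __init__(self, cache_dir, hot_threshold=1000):
        self.db = {}              # stand-in for the real database
        self.db_reads = 0         # counts how often the "DB" is queried
        self.hits = Counter()     # assumed to be reset once a week
        self.cache_dir = cache_dir
        self.hot_threshold = hot_threshold

    def _path(self, title):
        return os.path.join(self.cache_dir, title.replace("/", "_") + ".html")

    def _render(self, title):
        self.db_reads += 1
        return "<h1>%s</h1><p>%s</p>" % (
            html.escape(title), html.escape(self.db[title]))

    def _write_cache(self, title, page):
        tmp = self._path(title) + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(page)
        os.replace(tmp, self._path(title))  # atomic swap

    def commit(self, title, text):
        self.db[title] = text
        if self.hits[title] > self.hot_threshold:
            self._write_cache(title, self._render(title))
        elif os.path.exists(self._path(title)):
            os.remove(self._path(title))    # drop the stale copy

    def view(self, title):
        self.hits[title] += 1
        path = self._path(title)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                return f.read()             # served without touching the DB
        page = self._render(title)
        if self.hits[title] > self.hot_threshold:
            self._write_cache(title, page)
        return page
```

The threshold keeps rarely read pages off the disk; set it to 0 and every page gets a static copy after its first view.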
hardware wise:
[...]
Before buying hardware, maybe designing the architecture could be a plus !)
Also, I guess it has already been done, but there could
be some tuning on the servers:
- filesystem
- web server in itself
- db
- changing some transparent component,
And last but not least, there could be a mirror strategy:
http://mirror1.fr.wikipedia.org => read-only, for instance, and the edit
option would take you to the core server :)
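The mirror idea could be sketched like this in Python. The core host name `fr.wikipedia.org`, the `action=edit` query convention and the `route` function are assumptions for the sake of the example: plain page views stay on the read-only mirror, and anything that writes is sent to the core server.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit

CORE_HOST = "fr.wikipedia.org"       # assumed name of the core server
WRITE_ACTIONS = {"edit", "submit"}   # assumed query values that modify data


def route(url, method="GET"):
    """Return the URL that should actually serve this request.

    Reads are left on whatever mirror they arrived at; writes are
    rewritten to point at the core server.
    """
    parts = urlsplit(url)
    action = parse_qs(parts.query).get("action", ["view"])[0]
    is_write = method not in ("GET", "HEAD") or action in WRITE_ACTIONS
    if is_write:
        return urlunsplit(parts._replace(netloc=CORE_HOST))
    return url
```

A mirror would answer a write request with a redirect to `route(url)` instead of serving it itself, so only the core server ever touches the master database.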
--
Julien Tayon
http://www.tayon.net/ http://libroscope.org/
If truth is a woman, let us try to seduce her before seizing her.