Sj wrote:
===In the spirit of redundancy and availability===
A project suggestion:
Develop a completely separate and redundant project to maintain our
own fast, quickly-updated static mirror of Wikipedia content. We
could redirect visitors to this mirror, via DNS if necessary, in case
of any real catastrophe; it would also be useful when the primary site
is slow.
Advantages: it's redundant. Improves catastrophe protection.
Improves availability of data and discussions stored on-wiki. Further
search improvements? '''Could be handled entirely separate from core
cluster work; offloaded onto non-core devs.'''
Disadvantages: it's redundant. More work.
An entirely separate set will waste resources that could be keeping the
site running well at peak access hours.
What we *should* do, however, is have the infrastructure ready to switch
over to a good, complete read-only mode as fast as possible when the
core database servers explode. Historically the databases have always
been the sticking point when recovering from 'interesting' problems.
These extra off-site datacenters we'll be setting up soon will fit into
that; they can have local copies of data which they could run read-only
from.
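
To make the read-only fallback concrete, here is a rough sketch of the decision logic only. Everything in it is hypothetical: the names pick_mode and handle_request, the probe callback, and the retry count of 3 are illustrative assumptions, not how the site software is actually wired up.

```python
from typing import Callable


def pick_mode(probe_master: Callable[[], bool], attempts: int = 3) -> str:
    """Return "read-write" if the master answers any probe, else "read-only".

    probe_master is a hypothetical health check supplied by the caller.
    """
    for _ in range(attempts):
        if probe_master():
            return "read-write"
    return "read-only"


def handle_request(is_write: bool, mode: str) -> str:
    """Decide where a request goes under the current mode."""
    if mode == "read-write":
        # Normal operation: writes hit the master, reads can use a replica.
        return "master" if is_write else "replica"
    # Read-only mode: serve reads from the local copy, refuse edits.
    return "rejected" if is_write else "local-copy"


if __name__ == "__main__":
    mode = pick_mode(lambda: False)  # simulate a dead master
    print(mode, handle_request(False, mode), handle_request(True, mode))
```

The point of keeping the switch this small is that it can be flipped quickly; the hard part is still having a usable local copy of the data at each site.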
===In the spirit of failsafes and backups===
A system to periodically store entire database snapshots (every
month?), to recover from subtle, undetected database corruption. (my
impression was that this is not done already)
Space requirements for this would be on the order of several hundred
gigabytes per year. The days when I could back up all of Wikipedia onto
one CD-R every month are long gone. :)
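
As a back-of-the-envelope check on that estimate, the arithmetic can be sketched as below. The 30 GB snapshot size is purely an illustrative assumption, not a measured figure; 700 MB is the common CD-R capacity.

```python
import math


def yearly_storage_gb(snapshot_gb: float, snapshots_per_year: int = 12) -> float:
    """Total space for keeping every snapshot taken in a year."""
    return snapshot_gb * snapshots_per_year


def cdrs_needed(snapshot_gb: float, cdr_mb: int = 700) -> int:
    """Number of CD-Rs needed to hold one snapshot."""
    return math.ceil(snapshot_gb * 1024 / cdr_mb)


if __name__ == "__main__":
    snapshot_gb = 30  # assumed size of one full snapshot, for illustration only
    print(yearly_storage_gb(snapshot_gb))  # 360
    print(cdrs_needed(snapshot_gb))        # 44
```

With monthly snapshots at that assumed size, the yearly total lands in the "several hundred gigabytes" range.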
-- brion vibber (brion @ pobox.com)