On 5/6/05, Brion Vibber <brion(a)pobox.com> wrote:
Sj wrote:
===In the spirit of redundancy and availability===
A project suggestion:
Develop a completely separate and redundant project to maintain our
own fast, quickly-updated static mirror of Wikipedia content. We
An entirely separate set will waste resources that could be keeping the
site running well at peak access hours.
True. The part I'm keen on is finding projects that can be offloaded
to groups of devs who aren't trusted enough to futz around in the core
cluster... Maybe this can be done in a way that wastes neither
hardware nor developer resources.
===In the spirit of failsafes and backups===
A system to periodically store entire database snapshots (every
month?), to recover from subtle, undetected database corruption. (My
impression is that this is not done already.)
Space requirements for this would be on the order of several hundred
gigabytes per year. The days when I could back up all of Wikipedia onto
one CD-R every month are long gone. :)
Yes. I'm thinking perhaps a single hard drive for every couple of
snapshots; 60GB now, and growing as the projects grow. That also
offers distributed protection against media failure...
Or we could write everything onto tapes or optical disks for
longer-term storage. A TB a year isn't prohibitive.
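
A rough back-of-the-envelope check of those numbers, as a sketch only.
The 60GB snapshot size and the monthly schedule come from the discussion
above; the 5% monthly growth rate and the function names are my own
assumptions for illustration, not measured figures.

```python
def yearly_snapshot_storage_gb(snapshot_gb, monthly_growth=0.0, snapshots_per_year=12):
    """Total GB needed to keep every snapshot taken over one year.

    monthly_growth is a hypothetical compounding rate (0.05 means 5% per month).
    """
    total = 0.0
    size = snapshot_gb
    for _ in range(snapshots_per_year):
        total += size                  # keep this month's full snapshot
        size *= 1.0 + monthly_growth   # next month's snapshot is larger
    return total


def drives_needed(snapshots_per_year=12, snapshots_per_drive=2):
    """Drives used per year if each drive holds a fixed number of snapshots."""
    return -(-snapshots_per_year // snapshots_per_drive)  # ceiling division


if __name__ == "__main__":
    print(yearly_snapshot_storage_gb(60))         # no growth at all
    print(yearly_snapshot_storage_gb(60, 0.05))   # assumed 5% growth per month
    print(drives_needed())                        # "a couple" of snapshots per drive
```

With no growth, twelve 60GB snapshots come to 720GB, which matches
"several hundred gigabytes per year"; with the assumed 5% monthly growth
the total lands a little under a terabyte.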
--
+sj+