Redundancy for wikipedia - Wikitech-l

9 Apr 2006


      Hi all,
I guess you probably know that Wikipedia was down earlier, due to a
power fault at the colo centre (apparently). I was chatting to brion on
IRC and he recommended that I contact this list.
I think it should be possible to make Wikipedia fully redundant to
outages of individual data centres, and not too expensive. Here's how.
Get a BGP portable IP address range. Advertise this range from TWO
locations, at separate data centres. Have basically identical read-only
servers on each range, with the same IP addresses. Don't worry about IP
conflicts, as the servers are identical, and the shortest route from any
given client will point to just one data centre, and not move unless
that data centre goes down, when it will automatically fall back to the
other.
Under normal conditions, your load is shared between both data centres,
so you don't need to actually increase the number of servers. If one
goes down, all requests go to the other, so performance might drop, but
Wikipedia should stay up.
This only works for read-only servers, so the process of editing
Wikipedia would still rely on one of the groups (or some subset of
servers in that group) being masters, and all the other servers being
slaves that sync off those masters.
It's just a suggestion, I'd be interested to hear what you think.
If you are interested, I know a hosting company that has a BGP-portable
range (I used to work for them), and I could talk to them about whether
they can set up redundant IP tunnelling for that range to whatever IP
addresses (VPN endpoints) you want, so you wouldn't even need to have
your own BGP range.
Cheers, Chris.
-- 
  ___ __     _
 / __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |