On 02/06/11 19:52, George Herbert wrote:
On Thu, Jun 2, 2011 at 10:55 AM, David Gerard <dgerard@gmail.com> wrote:
On 2 June 2011 18:48, Fae <faenwp@gmail.com> wrote:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed by an ongoing DoS attack from Eastern Europe. The switchover eventually appears successful and data is synchronized with Hong Kong for the next three weeks. At the end of those three weeks, amid a massive raft of escalating complaints about images disappearing, it is realized that this is the result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles, and the loss is irrecoverable because backups more than two weeks old are automatically deleted. With no archive strategy in place, it is estimated that the majority of digital assets have been permanently lost, and estimates for a 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over two years of work.
This sort of scenario is why some of us have a thing about the backups :-)
(Is there a good image backup of Commons and of the larger wikis, and
- and this one may be trickier - has anyone ever downloaded said
backups?)
- d.
I've floated this to Erik a couple of times, but if the Foundation would like an IT disaster response / business continuity audit, I can do that.
Tape is -- still -- your friend here. Flip the write-protect after writing, have two sets of off-site tapes, one copy of each in each of two secure and widely separated off-site locations run by two different organizations, and you're sorted.
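To make the invariant explicit, here's a tiny illustrative Python sketch of that scheme expressed as a checkable policy; the names and structure are mine, not any real WMF configuration:

# Purely illustrative: the scheme above as a checkable invariant.
tape_copies = [
    {"location": "site A", "operator": "org 1", "write_protected": True},
    {"location": "site B", "operator": "org 2", "write_protected": True},
]

def scheme_ok(copies):
    # Two copies, at distinct locations, run by distinct organizations,
    # all with the write-protect tab flipped after writing.
    return (len(copies) == 2
            and len({c["location"] for c in copies}) == 2
            and len({c["operator"] for c in copies}) == 2
            and all(c["write_protected"] for c in copies))

assert scheme_ok(tape_copies)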
Tape is the dumb backstop that will keep the data even when your supposedly infallible replicated and redundant systems fail. For example, it got Google out of a hole quite recently when they had to restore a significant number of Gmail accounts from tape. (see http://www.talkincloud.com/the-solution-to-the-gmail-glitch-tape-backup/ )
And, unlike other long-term storage media, tape has a long track record: there is a good understanding of its practical lifespan and risks, and well-understood procedures exist for making and verifying duplicate sub-master copies onto newer tape technologies over time to extend archive life, etc.
If we say that Wikimedia Commons currently has ~10M images, and if we allow 1 MB per image, that's only 10 TB: that will fit nicely on seven LTO-5 tapes. Using LTFS would also make data access easier and improve long-term data robustness. If you like, you can slip in a complete dump of the Mediawiki source and Commons database on each tape, as well.
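For anyone who wants to check the arithmetic, here's a quick Python sketch (LTO-5 native capacity is 1.5 TB, uncompressed):

import math

images = 10_000_000          # ~10M Commons images, as above
bytes_per_image = 1_000_000  # allow 1 MB per image
lto5_capacity = 1.5e12       # LTO-5 native capacity, in bytes

total = images * bytes_per_image          # 10 TB
tapes = math.ceil(total / lto5_capacity)  # -> 7 tapes per copy

print(f"{total / 1e12:.0f} TB -> {tapes} LTO-5 tapes per copy")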
Even if I'm wrong by an order of magnitude, and 140 tapes are needed instead of 14 (two off-site copies of each of the seven), that's still less than $10k of media -- and I wouldn't be surprised if tape storage companies were eager to vie to be the one that can claim to donate the media and drives providing Wikipedia's long-term backup system.
With two tape drives running in parallel at an optimal 140 MB/s each, the whole backup would take less than a day. Even if I were wrong about both the writing speed and the archive size by an order of magnitude each, it would still take less than three months.
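And the timing arithmetic, under the same assumptions:

total = 10e12               # 10 TB archive, as above
drives, speed = 2, 140e6    # two drives at 140 MB/s each

best_hours = total / (drives * speed) / 3600
print(f"best case: about {best_hours:.0f} hours")   # ~10 hours

# Pessimistic case: 10x the data at a tenth of the speed.
worst_days = (total * 10) / (drives * speed / 10) / 86400
print(f"worst case: about {worst_days:.0f} days")   # ~41 days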
The same tape systems could also, trivially, be used to back up all the other WMF sites along similar lines.
-- Neil