Aude wrote:
I'm intrigued by the third server, which will keep copies of all media files. Right now, as I understand it, copies of images and media files on commons and elsewhere are not backed up. In September, 496 files were lost from commons, and the developers were asking around if anyone had backup copies and asking uploaders to re-upload. (http://www.nabble.com/Massive-image-loss-td19328360.html)
Will the new media file backup server address this sort of problem? Or being a live backup, will it suffer from the same issues/mistakes that affect the main Wikimedia servers?
I have been thinking about this too. The current idea is indeed to have a live mirror (based on ZFS replication, I think), so any accidental deletion will be mirrored too. It only protects against data loss from hardware failure (think: fire in the data center).
In order to protect against accidental deletions, an intentionally lagged mirror would be useful. But I have no idea how to implement something like that.
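One way a lagged mirror could look, as a rough sketch only: copy new and changed files right away, but apply a deletion only after the file has been missing from the live tree for a set time. Nothing here reflects the actual Wikimedia setup or ZFS replication; the function name `sync_lagged`, the `missing_since` dict and the plain directory walk are all assumptions made for illustration. Note that it only delays deletions; a file overwritten with bad content would still be copied over immediately.

```python
import os
import shutil
import time


def sync_lagged(live, mirror, missing_since, lag_seconds, now=None):
    """Copy new or changed files at once, but delay deletions by lag_seconds.

    missing_since (hypothetical bookkeeping) maps a relative path to the
    time it was first seen missing from live; the caller keeps it between
    runs. Returns the list of relative paths deleted from the mirror.
    """
    now = time.time() if now is None else now

    live_files = set()
    for root, _dirs, files in os.walk(live):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, live)
            live_files.add(rel)
            dst = os.path.join(mirror, rel)
            # Copy when the mirror lacks the file or holds an older version.
            if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)
            # The file is present on live, so drop any pending deletion.
            missing_since.pop(rel, None)

    deleted = []
    for root, _dirs, files in os.walk(mirror):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), mirror)
            if rel in live_files:
                continue
            # Remember when the file was first seen missing from live.
            first_missing = missing_since.setdefault(rel, now)
            if now - first_missing >= lag_seconds:
                os.remove(os.path.join(mirror, rel))
                del missing_since[rel]
                deleted.append(rel)
    return deleted
```

Run from cron with, say, `lag_seconds=7 * 86400`, this would leave a week to notice an accidental deletion and copy the file back from the mirror. The space cost is one copy plus whatever was deleted during the lag window, not three full copies.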
A more conventional solution would be to have two more copies of the files, on the same server, which are synced every, say, 24 hours: backup a -> backup b, live mirror -> backup a. But this would require three times the space. Considering we currently have 5TB worth of media files (does this include thumbnails?), and the new server will have 24TB of space, this could work for a while. But taking into account exponential growth, it wouldn't last long.
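To put a rough number on "wouldn't last long", here is a back-of-the-envelope calculation. The 5TB and 24TB figures are the ones above; the 12-month doubling time is purely an assumption for illustration, since no growth rate has been given here, and the helper name `months_until_full` is made up.

```python
import math


def months_until_full(current_tb, copies, capacity_tb, doubling_months):
    """Months until `copies` full copies of the data outgrow the server,
    assuming the data doubles every `doubling_months` months."""
    needed_now = current_tb * copies
    if needed_now >= capacity_tb:
        return 0.0
    # Solve needed_now * 2 ** (t / doubling_months) == capacity_tb for t.
    return doubling_months * math.log2(capacity_tb / needed_now)


# 3 copies of 5TB on a 24TB server, ASSUMING the data doubles every 12 months.
months = months_until_full(5, 3, 24, 12)
print(f"{months:.1f} months")  # prints "8.1 months"
```

Under that assumed growth rate, three copies (15TB today) would fill the 24TB in well under a year, while a single copy would fit for a bit over two years.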
Tripling space requirements seems a bit of overkill. Maybe there's a smarter solution. Ideas?
--daniel