On 16 March 2010 21:43, Tim Starling <tstarling(a)wikimedia.org> wrote:
> About 40% of our text storage has been recompressed into
> DiffHistoryBlob format, which uses a combination of binary diffs and
> gzip to reduce storage space.
>
> Approximately 1.9TB of text storage, mostly revisions compressed
> individually with gzip, was recompressed to about 140GB, a saving of 93%.

Revisions were compressed individually? I thought they were
concatenated and then compressed to take advantage of revisions of the
same article usually only differing by small amounts (and so being
highly compressible). I'm sure brion said that sometime...
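
To make the question concrete, here is a rough sketch comparing the two approaches: compressing each revision on its own versus concatenating them first. It is not the DiffHistoryBlob code and does no binary diffing. The revision data is made up, the helper names (make_revisions, individual_size, concatenated_size) are hypothetical, and Python's zlib stands in for gzip since both use DEFLATE.

```python
import zlib


def make_revisions(count=50, sentences=200):
    """Build fake revisions of one article; each differs from the last by one line."""
    lines = [f"Sentence number {i} of a hypothetical article." for i in range(sentences)]
    revisions = []
    for rev in range(count):
        # Simulate a small edit: overwrite a single line per revision.
        lines[rev % sentences] = f"Edited in revision {rev}."
        revisions.append("\n".join(lines).encode("utf-8"))
    return revisions


def individual_size(revisions):
    """Total bytes when every revision is compressed separately."""
    return sum(len(zlib.compress(rev)) for rev in revisions)


def concatenated_size(revisions):
    """Total bytes when all revisions are joined and compressed as one stream."""
    return len(zlib.compress(b"".join(revisions)))


revisions = make_revisions()
raw = sum(len(rev) for rev in revisions)
print("raw bytes:         ", raw)
print("individually:      ", individual_size(revisions))
print("concatenated first:", concatenated_size(revisions))
```

On data like this the concatenated stream should come out much smaller, because each revision can be encoded as back-references to the one before it. One caveat: DEFLATE only looks back 32 KiB, so the trick stops helping once a single revision is larger than that window.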