Brion Vibber wrote:
More importantly, not every decompressor will decompress concatenated
streams. Dictating which decoder end-users should use is not cool. :)
The reference bzip2 tool has supported it for ages.
I added support for concatenated bzip2 files to PHP's bz2 extension in
September. It is only available in newer PHP versions. (Oddly,
importDump.php doesn't seem to support bzipped dumps.)
I don't know about Java/mwdumper support.
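
For anyone unsure what "supporting concatenated streams" means for a
decoder, here is a minimal sketch in Python (chosen only for illustration;
it is not the PHP patch mentioned above). The function name
decompress_concatenated is made up for this example. It relies on the
standard bz2 module, where a BZ2Decompressor stops at the end of the first
stream and leaves the remaining bytes in unused_data:

```python
import bz2


def decompress_concatenated(data: bytes) -> bytes:
    """Decompress a buffer holding one or more back-to-back bzip2 streams."""
    chunks = []
    while data:
        # Each decompressor object handles exactly one stream.
        decoder = bz2.BZ2Decompressor()
        chunks.append(decoder.decompress(data))
        if not decoder.eof:
            raise ValueError("truncated bzip2 stream")
        # Bytes after the end-of-stream marker belong to the next stream.
        data = decoder.unused_data
    return b"".join(chunks)


# Two independent streams glued together, as a many-stream dump would be.
multi = bz2.compress(b"first ") + bz2.compress(b"second")
print(decompress_concatenated(multi))
```

A decoder that lacks this loop returns only the first stream's contents,
which is the compatibility problem being discussed. (Python's own
bz2.decompress() has handled multi-stream input since 3.3.)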
** 2. If dump.bz2 was single-block, many-stream (as opposed to the current
many-block, single-stream), then people on the importing end could speed
up *decompression* with pbzip2.
Lack of compatibility with other tools makes this format undesirable;
further note that a smarter decompressor could act as bzip2recover does
to estimate block boundaries and decompress them speculatively. In the
rare case of an incorrect match, you've only lost one to two blocks'
worth of time.
Supporting those other tools' streams adds quite a bit of complexity for
decompressors that want to decompress only a single block (because blocks
are not byte-aligned).
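
To make the alignment problem concrete, here is a rough sketch in Python of
the kind of scan a bzip2recover-style tool has to do. It is an illustration
under stated assumptions, not the bzip2recover or dbzip2 code, and
find_block_starts is a made-up name. It slides a 48-bit window over the
data one bit at a time, looking for the block header magic 0x314159265359,
because a block can start at any bit offset:

```python
import bz2

BLOCK_MAGIC = 0x314159265359  # 48-bit marker at the start of each bzip2 block
MASK_48 = (1 << 48) - 1


def find_block_starts(data: bytes) -> list[int]:
    """Return candidate block start positions as *bit* offsets into data."""
    window = 0
    bit_pos = 0
    offsets = []
    for byte in data:
        # bzip2 packs bits most-significant-bit first within each byte.
        for shift in range(7, -1, -1):
            window = ((window << 1) | ((byte >> shift) & 1)) & MASK_48
            bit_pos += 1
            if bit_pos >= 48 and window == BLOCK_MAGIC:
                offsets.append(bit_pos - 48)
    return offsets


# Level 1 uses ~100 kB blocks, so 300 kB of run-free input spans several.
sample = bz2.compress(bytes(i % 251 for i in range(300000)), compresslevel=1)
starts = find_block_starts(sample)
# The first block follows the 4-byte "BZh1" stream header, i.e. bit 32.
print(starts[0], [s % 8 for s in starts])
```

The matches are only candidates: the same 48 bits can occur by chance
inside compressed data, which is the "incorrect match" case mentioned
above. A decoder working on a single block also has to shift the block's
bits back to a byte boundary before it can process them.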
I never got round to completing the decompressor implementation for
dbzip2, though..
Is that the code at
http://svn.wikimedia.org/viewvc/mediawiki/trunk/dbzip2/ ?