On Sat, Jan 28, 2012 at 09:56:13AM +0200, Ariel T. Glenn wrote:
The other thing about switching from one bzip2 implementation to another is that I rely on some specific properties of the bzip2 output (and its library) for integrity checking and for locating blocks in the middle of a dump when needed. I'd need to make sure my hacks still worked with the new output.
Sure. Integrity and compatibility should have top priority. I didn't take libbzip2 compatibility into account at first. Maybe a viable approach for us would be to detect reliably (any idea how?) which tool was used to compress the archive, and then use bunzip2 or pbunzip2 accordingly.
Parallel unpacking of an archive in the right format gives us a 4-5x speedup, whereas on a regular bzip2 archive there is no speedup, only ~6x CPU waste.
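One possible detection heuristic, as a rough Python sketch rather than a tested solution: as far as I know, pbzip2 compresses each input chunk as an independent bzip2 stream and concatenates the results, while plain bzip2 writes a single stream containing many blocks. Every stream starts byte-aligned with "BZh", a level digit 1-9 and the block magic 0x314159265359, so counting those 10-byte headers near the start of the file should distinguish the two. The function names and the 16 MiB probe size are my own assumptions, not part of either tool.

```python
import re

# A bzip2 stream starts with "BZh", a level digit 1-9, and then the
# 48-bit block magic 0x314159265359 of its first block.
# Stream headers are byte-aligned; the headers of later blocks inside
# one stream are NOT, so a byte-wise search mostly sees stream starts.
STREAM_HEADER = re.compile(rb"BZh[1-9]\x31\x41\x59\x26\x53\x59")


def count_stream_headers(data: bytes) -> int:
    """Count byte-aligned bzip2 stream headers in a buffer."""
    return len(STREAM_HEADER.findall(data))


def looks_multistream(path: str, probe_bytes: int = 16 * 1024 * 1024) -> bool:
    """Heuristic: True if the first probe_bytes of the file contain more
    than one stream header, which suggests pbzip2-style output.

    probe_bytes is an assumed value; it has to be larger than one
    compressed chunk for the second header to show up.
    """
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    return count_stream_headers(head) > 1
```

Caveats: a false positive needs the same 10 bytes to occur by chance inside compressed data, which is very unlikely but not impossible, and a file made by concatenating several plain bzip2 outputs would also be reported as multi-stream. Whether that case unpacks well in parallel would need checking separately.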
I had a look at the links sent with the packer comparison. It seems that dbzip2 development has been more or less dormant since 2008, while pbzip2 appears to be actively developed and maintained. Maybe the pbzip2 devs would be open to our wishlist? ;-)
regards,