> For external uses like XML dumps integrating the compression
> strategy into LZMA would however be very attractive. This would also
> benefit other users of LZMA compression like HBase.
For dumps or other uses, 7za -mx=3 / xz -3 is your best bet.
That setting has a 4 MB buffer, gets compression ratios within 15-25%
of current 7zip (or histzip), and runs at 30 MB/s on my box,
which is still 8x faster than the status quo (going by a 1 GB
benchmark).
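If you want to check the ratio/speed tradeoff on your own data, here is a rough sketch using Python's standard lzma module, whose presets correspond to the xz -0 through -9 levels. The measure() helper and the synthetic sample are my own illustration, not what I used for the numbers above, and results will vary with hardware and input:

```python
import lzma
import time


def measure(data: bytes, preset: int):
    """Compress data as .xz at the given preset; return (ratio, MB/s)."""
    start = time.perf_counter()
    packed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    elapsed = time.perf_counter() - start
    ratio = len(packed) / len(data)               # smaller is better
    speed = len(data) / 1e6 / max(elapsed, 1e-9)  # input megabytes per second
    return ratio, speed


if __name__ == "__main__":
    # Synthetic stand-in for a dump: repetitive XML-like text, not real data.
    line = b"<page><title>Example</title><text>lorem ipsum dolor</text></page>\n"
    sample = line * 200_000
    for preset in (3, 6):  # 3 is the level suggested above, 6 is the xz default
        ratio, speed = measure(sample, preset)
        print(f"preset {preset}: ratio {ratio:.4f}, {speed:.1f} MB/s")
```

To try it on a real file, read the bytes from disk and pass them in place of the synthetic sample.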
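Getting quick-and-dirty long-range matching into LZMA isn't
feasible for me personally: I couldn't confidently hack on liblzma,
and Igor might not want to do anything as niche-specific as this (but
who knows!). There may also be inherent technical difficulties: even
with a faster matching strategy, the LZMA *format* seems to require
some intricate stuff (range coding) that could be a blocker to getting
the ideal speeds (honestly not sure). Still, I left a note on the
7-Zip boards as folks suggested; feel free to add anything there: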
https://sourceforge.net/p/sevenzip/discussion/45797/thread/73ed3ad7/
Thanks for the reply,
Randall