For external uses like XML dumps integrating the compression strategy
into LZMA would however be very attractive. This would also benefit
other users of LZMA compression like HBase.
For dumps or other uses, 7za -mx=3 / xz -3 is your best bet.
That has a 4 MB buffer, compression ratios within 15-25% of
current 7zip (or histzip), and goes at 30MB/s on my box,
which is still 8x faster than the status quo (going by a 1GB
benchmark).
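
A minimal sketch of trying that preset from Python, assuming the standard-library lzma module (which writes the same .xz format as xz -3). The compress_at_preset helper and the sample data are hypothetical, made up for illustration; the sample is far too small to reproduce the 30MB/s figure, so treat the printed speed as a rough indication only.

```python
import lzma
import time


def compress_at_preset(data: bytes, preset: int = 3):
    """Compress data as .xz at the given preset; return (compressed, MB/s)."""
    start = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    elapsed = time.perf_counter() - start
    # Guard against a zero timer reading on very small inputs.
    mb_per_s = len(data) / 1e6 / elapsed if elapsed > 0 else float("inf")
    return compressed, mb_per_s


# Hypothetical stand-in for a dump: many near-identical XML-like revisions.
sample = b"".join(
    b"<revision id='%d'><text>Lorem ipsum dolor sit amet</text></revision>\n" % i
    for i in range(20000)
)

out, speed = compress_at_preset(sample, preset=3)
print(f"{len(sample)} -> {len(out)} bytes at about {speed:.1f} MB/s")
```

On a real dump, the command-line equivalents named above (7za -mx=3 or xz -3) are the more sensible thing to time, since they stream from disk instead of holding the whole input in memory.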
Trying to get quick-and-dirty long-range matching into LZMA isn't
feasible for me personally and there may be inherent technical
difficulties. Still, I left a note on the 7-Zip boards as folks
suggested; feel free to add anything there:
https://sourceforge.net/p/sevenzip/discussion/45797/thread/73ed3ad7/
Thanks for the reply,
Randall
On Tue, Jan 21, 2014 at 2:19 PM, Randall Farmer <randall(a)wawd.com> wrote:
For external uses like XML dumps integrating the compression strategy
into LZMA would however be very attractive. This would also benefit
other users of LZMA compression like HBase.
For dumps or other uses, 7za -mx=3 / xz -3 is your best bet.
That has a 4 MB buffer, compression ratios within 15-25% of
current 7zip (or histzip), and goes at 30MB/s on my box,
which is still 8x faster than the status quo (going by a 1GB
benchmark).
Re: trying to get long-range matching into LZMA, first, I
couldn't confidently hack on liblzma. Second, Igor might
not want to do anything as niche-specific as this (but who
knows!). Third, even with a faster matching strategy, the
LZMA *format* seems to require some intricate stuff (range
coding) that could be a blocker to getting the ideal speeds
(honestly not sure).
In any case, I left a note on the 7-Zip boards as folks have
suggested:
https://sourceforge.net/p/sevenzip/discussion/45797/thread/73ed3ad7/
Thanks for the reply,
Randall