> For external uses like XML dumps integrating the compression 
> strategy into LZMA would however be very attractive. This would also
> benefit other users of LZMA compression like HBase.

For dumps or other uses, 7za -mx=3 / xz -3 is your best bet.

That has a 4 MB buffer, compression ratios within 15-25% of 
current 7zip (or histzip), and goes at 30MB/s on my box, 
which is still 8x faster than the status quo (going by a 1GB 
benchmark).
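
For anyone scripting this rather than calling the command-line tools, here is a rough sketch of the same setting in Python. It assumes the standard-library lzma module, whose preset=3 should match xz -3 because both go through liblzma's presets. The helper names (compress_bytes, compress_file) and the sample data are made up for illustration, and I haven't benchmarked this path, so the speed figures above may not carry over.

```python
import lzma
import shutil

# Assumption: preset 3 here is the same liblzma preset that `xz -3` selects.
XZ_PRESET = 3


def compress_bytes(data: bytes) -> bytes:
    """Compress an in-memory buffer into an .xz container at preset 3."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, preset=XZ_PRESET)


def compress_file(src_path: str, dst_path: str) -> None:
    """Stream src_path into an .xz file so a large dump never sits fully in memory."""
    with open(src_path, "rb") as src, lzma.open(
        dst_path, "wb", format=lzma.FORMAT_XZ, preset=XZ_PRESET
    ) as dst:
        shutil.copyfileobj(src, dst)


if __name__ == "__main__":
    # Hypothetical stand-in for a chunk of an XML dump.
    sample = b"<page><title>Example</title><text>hello world</text></page>\n" * 2000
    packed = compress_bytes(sample)
    assert lzma.decompress(packed) == sample
    print(len(sample), "bytes ->", len(packed), "bytes")
```

The streaming helper is the one that matters for real dumps; the in-memory one is only there to make the round trip easy to check.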

Trying to get quick-and-dirty long-range matching into LZMA isn't 
feasible for me personally and there may be inherent technical 
difficulties. Still, I left a note on the 7-Zip boards as folks
suggested; feel free to add anything there: 
https://sourceforge.net/p/sevenzip/discussion/45797/thread/73ed3ad7/

Thanks for the reply,
Randall


On Tue, Jan 21, 2014 at 2:19 PM, Randall Farmer <randall@wawd.com> wrote:
> For external uses like XML dumps integrating the compression 
> strategy into LZMA would however be very attractive. This would also
> benefit other users of LZMA compression like HBase.

For dumps or other uses, 7za -mx=3 / xz -3 is your best bet.

That has a 4 MB buffer, compression ratios within 15-25% of 
current 7zip (or histzip), and goes at 30MB/s on my box, 
which is still 8x faster than the status quo (going by a 1GB 
benchmark).

Re: trying to get long-range matching into LZMA, first, I 
couldn't confidently hack on liblzma. Second, Igor might 
not want to do anything as niche-specific as this (but who 
knows!). Third, even with a faster matching strategy, the 
LZMA *format* seems to require some intricate stuff (range 
coding) that may be a blocker to getting the ideal speeds 
(honestly not sure).

In any case, I left a note on the 7-Zip boards as folks have 
suggested: 
https://sourceforge.net/p/sevenzip/discussion/45797/thread/73ed3ad7/

Thanks for the reply,
Randall