Hi Stian,
I was playing with your tool yesterday. In my experiments I found that the access time for a page is almost constant, and that it is dominated by the cost of uncompressing the index.
Since the program is launched anew every time, it has to uncompress, store and read the index over and over again.
I modified 7z_C a bit to accept the same parameters as 7z, and found that it takes 4.2 seconds, against 7-8.2 seconds for 7z, 6.2 vs 7.2...
That is a big difference, but it is still too slow. The gain may come from the program being less general: more specific code and a smaller binary. Since the OS has to load the program so often, the smaller it is, the better.
My original idea was to modify it so that the file index could be mmapped. But since the index contains lots of pointers, I'm now considering turning it into a full server (dropping the Ruby part) so that it can keep the 7z data in memory.
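
To make the pointer problem concrete: an index can only be mmapped as-is if it stores offsets instead of pointers. Here is a rough sketch of that idea in Python. The record layout (32-byte name, 8-byte offset, 8-byte size) and the function names are made up for illustration; this is not the 7z index format, just what a flat, pointer-free index might look like.

```python
import mmap
import struct

# Hypothetical fixed-width record: NUL-padded name, data offset, data size.
# Names longer than 32 bytes would be truncated by this layout.
RECORD = struct.Struct("<32sQQ")


def write_index(path, entries):
    """Write (name, offset, size) tuples as records sorted by encoded name."""
    ordered = sorted(entries, key=lambda e: e[0].encode("utf-8"))
    with open(path, "wb") as f:
        for name, offset, size in ordered:
            f.write(RECORD.pack(name.encode("utf-8"), offset, size))


def open_index(path):
    """Map the index read-only; nothing is parsed or copied up front."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def lookup(mm, name):
    """Binary search over the mapped records; returns (offset, size) or None."""
    key = name.encode("utf-8")
    lo, hi = 0, len(mm) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        raw, offset, size = RECORD.unpack_from(mm, mid * RECORD.size)
        current = raw.rstrip(b"\0")
        if current == key:
            return offset, size
        if current < key:
            lo = mid + 1
        else:
            hi = mid
    return None
```

With a layout like this, each lookup touches only a few pages of the file and startup does no parsing at all. The catch is that the index has to be converted to this form once beforehand, which is part of why the server approach looks simpler to me.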
Opinions?