How about using another algorithm that does this already?
Thanks to those who answered my mail. The thing is that I really, really want my program to work natively with the dump files from Wikipedia: it would take me days to recompress 1.5 GB files, I would have no capacity to host them, and so on. I am not making this program just to generate one offline CD for one language, but to be a tool that works with all dump files (right now I have the dump files of 9 smaller languages in one directory, and all the interwiki links work etc. - just slowly)...
Therefore, if the people running the dumps would consider changing the format to something that is easier to random-access, I would be all for it. Indeed, I don't know the pros and cons that led them to choose 7zip in the first place. However, I don't even know where to start a discussion with them, and I imagine such a decision would take a long time to implement. So I figured that trying to tweak 7zip would probably be a much faster way. :)
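To illustrate the kind of random access I have in mind, here is a rough Python sketch. It assumes a dump split into independently compressed bzip2 streams plus a side index mapping each page title to the byte offset of the stream that contains it; the index format and the file names are made up for the example, not anything the current dumps provide.

    import bz2

    def load_index(index_path):
        """Read 'offset:title' lines into a {title: offset} dict (hypothetical format)."""
        index = {}
        with open(index_path, encoding="utf-8") as f:
            for line in f:
                offset, _, title = line.rstrip("\n").partition(":")
                index[title] = int(offset)
        return index

    def read_stream(dump_path, offset, block_size=256 * 1024):
        """Seek to one compressed stream and decompress only that stream."""
        decomp = bz2.BZ2Decompressor()
        parts = []
        with open(dump_path, "rb") as f:
            f.seek(offset)
            # Keep feeding bytes until this single stream ends (decomp.eof).
            while not decomp.eof:
                chunk = f.read(block_size)
                if not chunk:
                    break
                parts.append(decomp.decompress(chunk))
        return b"".join(parts).decode("utf-8")

    if __name__ == "__main__":
        index = load_index("pages-articles-index.txt")          # hypothetical file
        xml_fragment = read_stream("pages-articles.xml.bz2",     # hypothetical file
                                   index["Bergen"])
        print(xml_fragment[:500])

The point is that with a layout like this you only ever decompress the one small chunk that holds the page you want, instead of reading the whole archive from the start, which is what makes the current single solid 7zip archive so slow for lookups.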
If you have any pointers on getting in touch with the dump people, or if some of them are hanging around here, I'd love to have a discussion with them, both about the dump format itself and about some other technical details of how they prepare the material. (A week ago I also tried pointing out on a talk page that the static download page says the December dumps are currently in progress and links to November, even though according to the log all the December dumps are done, and you can download them if you type in the URL manually; apparently that talk page wasn't a good way of getting in touch with them.) I would also love to have several HTML dumps - one with only the article pages. Currently there is only one, and it includes all pages, even the image detail pages. Let me say, though, that they've done an awesome job, and there are some really neat decisions in how the static dumps are made.
Thanks a lot, Stian