Hi,
I've some ideas concerning libzim, which I would like to share with you.
There is a class zim::Files, which represents a set of zim files in one directory. I would like to drop that class if it is not used in Kiwix. (Emmanuel: Can you verify that?)
Let me explain why the class is there, why I would like to remove it, and how to replace the functionality.
In the original German Wikipedia DVD, the content of Wikipedia was in one zeno file, and the word index, which is also a zeno file, was in another. Images were included only in very poor quality because of the limited capacity of the DVD. There was a compact and a premium edition. The premium edition had images in much better quality on 3 additional DVDs, each with a zeno file containing part of the images. To use these better-quality images, the user had to copy all zeno files into one directory on the hard drive and configure the reader to use that directory. When an image was requested, the reader then searched for it in all zeno files and fetched the copy with the best quality.
I feel that this solution is not optimal. There is also a technical reason why it is not that good, and since I have a better idea (at least I think the idea is better ;-) ), I would like to drop that feature. The technical reason is simple: there is no common API to read directories. Almost all operating systems have opendir/readdir/closedir. Unfortunately ReactOS (and other systems which use win32) do not have these functions, so zimlib has to use a different API for these systems. This is not really a big problem, but it is still a problem. It is actually the only OS-specific code in zimlib (or rather in cxxtools, which zimlib uses and which provides a wrapper for these functions).
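To illustrate the portability issue, here is a minimal sketch of what such a wrapper has to do. The name listDirectory is made up for this mail; it is not the cxxtools API, and the real wrapper may look quite different. The point is only that the same task needs two separate code paths.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#else
#include <dirent.h>
#endif

// Hypothetical helper (not the cxxtools wrapper): returns the names of the
// entries in a directory, skipping "." and "..". Returns an empty vector
// if the directory cannot be opened.
std::vector<std::string> listDirectory(const std::string& path)
{
    std::vector<std::string> names;

#ifdef _WIN32
    // win32 code path: FindFirstFile/FindNextFile/FindClose
    WIN32_FIND_DATAA data;
    HANDLE handle = FindFirstFileA((path + "\\*").c_str(), &data);
    if (handle == INVALID_HANDLE_VALUE)
        return names;
    do
    {
        std::string name = data.cFileName;
        if (name != "." && name != "..")
            names.push_back(name);
    } while (FindNextFileA(handle, &data));
    FindClose(handle);
#else
    // POSIX code path: opendir/readdir/closedir
    DIR* dir = opendir(path.c_str());
    if (dir == 0)
        return names;
    while (dirent* entry = readdir(dir))
    {
        std::string name = entry->d_name;
        if (name != "." && name != "..")
            names.push_back(name);
    }
    closedir(dir);
#endif

    return names;
}
```

If zim::Files goes away, zimlib itself no longer needs anything like this, because a reader only ever opens the one file it is given.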
The functionality is not needed on the planned Wikipedia DVD for LinuxTag, since we won't have a premium edition but will have all data in one single zim file.
In the future I plan to create a utility which merges the content of 2 or more zim files into one. The structure of zim files makes this quite an easy operation, much easier than creating new zim files. In particular, the changes I made compared to zeno make it very easy and fast. Combining 2 zim files is almost as cheap as copying them. So instead of copying all zim files into one directory, users can just combine multiple zim files to create one single big file.
The utility will have an option to control how duplicate articles are handled. It may prefer the articles of one of the files, so it will be possible to make update files which contain only the changed articles. The resulting combined file will not be exactly the same as a newly created file, since it will have empty blob entries for removed article data. But the user won't see any difference.
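The duplicate handling can be sketched roughly as follows. This is only an in-memory model to show the idea: ArticleMap, DuplicatePolicy and mergeArticles are names invented for this mail, and nothing here reflects the actual zim file layout or the empty blob entries mentioned above.

```cpp
#include <map>
#include <string>

// Simplified stand-in for a zim file: article URL -> article data.
// (Assumption for illustration only; a real zim file is not a map.)
typedef std::map<std::string, std::string> ArticleMap;

// Which file wins when both contain the same article.
enum DuplicatePolicy { PreferFirst, PreferSecond };

// Hypothetical merge: combines two article sets into one, resolving
// duplicates according to the given policy.
ArticleMap mergeArticles(const ArticleMap& first,
                         const ArticleMap& second,
                         DuplicatePolicy policy)
{
    ArticleMap result = first;
    for (ArticleMap::const_iterator it = second.begin();
         it != second.end(); ++it)
    {
        if (policy == PreferSecond)
            result[it->first] = it->second;  // add, or overwrite a duplicate
        else
            result.insert(*it);              // add only if not already there
    }
    return result;
}
```

With PreferSecond, the second file acts as an update file: its articles replace the ones in the first file, and everything else is taken over unchanged.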
Tommi