Hi all,
as I wrote in an earlier post, I have compiled ZimReader for the Openmoko platform. To make more databases available for the device, I also had a look at Kiwix and the ZIM files supplied on its site [1].
Some questions arose from that, and it appears that this list is the right place to discuss them. Please correct me if I'm wrong.
I observed that Kiwix produces an "ad-hoc" type index. This may be useful on desktops, as they have the power to generate an index file on the fly. On small-footprint devices this is not reasonably possible, due to limited memory and CPU resources.
Even on a dual-core desktop with 3.5 GB of memory, Kiwix failed to produce an "ad-hoc" index of the openzim edition of the German Wikipedia, running out of memory after many hours.
Question 1: From the change log I see that Kiwix uses a prominent search engine (Xapian) instead of the mechanism ZimReader/Writer use. Is there an easy way to reuse an index produced by Kiwix on a different machine?
Question 2: Are there plans to enable Kiwix to read reusable indexes in the format released for ZimReader/Writer?
Question 3: Are there plans to enable Kiwix to produce such a reusable index?
Question 4: Wouldn't it be desirable to deliver reusable indexes together with the ZIM article databases on the Kiwix site, for all those people with less capable devices (MIDs, netbooks, phones)?
Question 5: The ZIM databases supplied on the Kiwix site [1] seem to use the article's title field as the article id field. I'm sure this solves some problems for Kiwix, but it means a search in ZimReader returns a list of article ids instead of a list of article titles. Since both Kiwix and ZimReader are part of the openzim standardization effort, this confuses me a bit. Which format is supposed to be the standard?
Question 6: I succeeded in producing a ZIM-format index of the openzim edition of the German Wikipedia using ZimWriter on the above 3.5 GB machine. Unlike the index supplied on DVD, the generated index is 1.5 GB in size (instead of 1.1 GB). Any ideas why that is?
Cheers, Marc