Hi,
On 16.12.2010 06:50, Andrew Dunbar wrote:
This is very interesting and I'll be watching it. Where do the HTML dumps come from? I'm pretty sure I've only seen "static" for Wikipedia and not for Wiktionary for example. I am also looking at adapting the parser for offline use to generate HTML from the dump file wikitext.
there are several ways to get the HTML data.
Emmanuel (Kelson), who is doing Kiwix and WP1.0, uses SQL dumps to set up a separate MediaWiki instance and then dumps the data from the command line.
Ralf from Pediapress has written a Python wrapper for zimlib, in order to integrate ZIM export into the Collection Extension. * http://github.com/schmir/pyzim
A few weeks ago Tommi from openZIM committed "wikizim" to the zimwriter codebase: a command-line tool that dumps a whole wiki into a ZIM file using the MediaWiki API. * http://svn.openzim.org/viewvc.cgi/trunk/zimwriter/
There have been discussions with Roan Kattouw about how best to use wikizim on the Wikimedia wikis. His approach is to integrate the relevant code from wikizim into the MediaWiki code itself, avoiding abstraction layers as much as possible in favour of better performance and lower system load.
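For anyone curious what "dumping via the MediaWiki API" looks like in practice, here is a minimal sketch of fetching one page's rendered HTML through the API's action=parse call (the same interface wikizim builds on). The endpoint and page title are just example values, and this is only an illustration, not wikizim's actual code:

```python
# Sketch: fetch rendered HTML for a single page via the MediaWiki API.
# Endpoint and title below are example values, not taken from wikizim.
import json
import urllib.parse
import urllib.request

def parse_url(api_endpoint, title):
    """Build an action=parse request URL asking for a page's HTML as JSON."""
    params = {"action": "parse", "page": title, "format": "json"}
    return api_endpoint + "?" + urllib.parse.urlencode(params)

def fetch_html(api_endpoint, title):
    """Fetch one page and return its rendered HTML from the JSON response."""
    with urllib.request.urlopen(parse_url(api_endpoint, title)) as resp:
        data = json.load(resp)
    # action=parse puts the HTML under parse -> text -> "*"
    return data["parse"]["text"]["*"]

# Example (requires network access):
# html = fetch_html("https://en.wikipedia.org/w/api.php", "ZIM (file format)")
```

A real dumper would of course iterate over all page titles (e.g. via list=allpages) and write each result into the ZIM file rather than returning it.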
/Manuel