Hi,
On 16.12.2010 06:50, Andrew Dunbar wrote:
> This is very interesting and I'll be watching it. Where do the HTML
> dumps come from? I'm pretty sure I've only seen "static" for Wikipedia
> and not for Wiktionary for example. I am also looking at adapting the
> parser for offline use to generate HTML from the dump file wikitext.
There are several ways to get the HTML data.
Emmanuel (Kelson), who is doing Kiwix and WP1.0, uses SQL dumps to set
up a separate instance of MediaWiki and dumps the data on the command line.
Ralf from Pediapress has written a wrapper for zimlib to use it with
Python, in order to integrate a ZIM export to the Collection Extension.
* http://github.com/schmir/pyzim
A few weeks ago Tommi from openZIM committed "wikizim" to the zimwriter
codebase, a command-line tool that dumps a whole wiki into a ZIM file
using the MediaWiki API.
* http://svn.openzim.org/viewvc.cgi/trunk/zimwriter/
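To give an idea of what "dumping a whole wiki using the MediaWiki API"
looks like, here is a minimal Python sketch that enumerates all page
titles via the standard api.php allpages query with continuation. This
is not the actual wikizim code, just an illustration of the API it
builds on; the wiki URL is only an example.

```python
import json
import urllib.parse
import urllib.request

def allpages_url(api_base, apfrom=None, limit=500):
    """Build an api.php query URL listing pages, resuming from apfrom."""
    params = {
        "action": "query",
        "list": "allpages",
        "aplimit": str(limit),
        "format": "json",
    }
    if apfrom:
        params["apfrom"] = apfrom
    return api_base + "?" + urllib.parse.urlencode(params)

def iter_all_pages(api_base):
    """Yield every page title, following the API's continuation marker."""
    cont = None
    while True:
        with urllib.request.urlopen(allpages_url(api_base, cont)) as resp:
            data = json.load(resp)
        for page in data["query"]["allpages"]:
            yield page["title"]
        cont = data.get("continue", {}).get("apcontinue")
        if not cont:
            break

# Example (network access required):
#   for title in iter_all_pages("https://en.wiktionary.org/w/api.php"):
#       print(title)
```

A real dumper would additionally fetch the rendered HTML for each page
(e.g. via action=parse) and hand it to the ZIM writer.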
There have been discussions with Roan Kattouw about how best to use
wikizim on the Wikimedia wikis. His approach is to integrate the
relevant code from wikizim into MediaWiki itself, avoiding abstraction
layers as much as possible to improve performance and reduce system load.
/Manuel
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch