Hi
For the first time, we have managed to release a complete dump of all encyclopedic articles of the English Wikipedia, *with thumbnails*.
This ZIM file is 40 GB in size and contains the current 4.5 million articles with their 3.5 million pictures: http://download.kiwix.org/zim/wikipedia_en_all.zim.torrent
This ZIM file is directly and easily usable on many types of devices like Android smartphones and Win/OSX/Linux PCs with Kiwix, or Symbian with Wikionboard.
You don't need a modern computer with a powerful CPU. For example, you can create a (read-only) Wikipedia mirror on a Raspberry Pi for ~100 USD by using our dedicated ZIM web server, kiwix-serve. A demo is available here: http://library.kiwix.org/wikipedia_en_all/
As always, we also provide a packaged version (for the main PC systems) which includes the full-text search index, the ZIM file and the binaries: http://download.kiwix.org/portable/wikipedia_en_all.zip.torrent
Also interesting: this file was generated in less than two weeks thanks to several recent innovations:
* Parsoid (the cluster), which gives us HTML output with additional semantic RDF tags
* mwoffliner, a nodejs script able to dump pages based on the Mediawiki API (and the Parsoid API)
* zimwriterfs, a solution able to compile any local HTML directory into a ZIM file
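As a rough illustration of the first step in that chain: before a script like mwoffliner can fetch any HTML, it has to enumerate the article titles through the Mediawiki API. The sketch below only builds the request URLs for the standard `list=allpages` query. The function name `buildAllPagesUrl` and the default endpoint are assumptions made for this example; this is not mwoffliner's actual code.

```javascript
// Minimal sketch, NOT mwoffliner's actual code: build Mediawiki API URLs
// that enumerate main-namespace article titles in batches.
'use strict';

// Hypothetical helper. `continueFrom` is the `apcontinue` value returned by
// the previous batch (omit it for the first request). `apiBase` is assumed
// to be the English Wikipedia endpoint.
function buildAllPagesUrl(continueFrom, apiBase = 'https://en.wikipedia.org/w/api.php') {
  const url = new URL(apiBase);
  url.searchParams.set('action', 'query');
  url.searchParams.set('list', 'allpages');
  url.searchParams.set('apnamespace', '0');              // 0 = article namespace
  url.searchParams.set('apfilterredir', 'nonredirects'); // skip redirect pages
  url.searchParams.set('aplimit', '500');                // batch size
  url.searchParams.set('format', 'json');
  if (continueFrom) {
    url.searchParams.set('apcontinue', continueFrom);
  }
  return url.toString();
}
```

A caller would request the first URL, read the titles from the JSON response, then call the helper again with the returned `apcontinue` value until the API stops providing one.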
We now have an efficient way to generate new ZIM files. Consequently, we will work to industrialize and automate the ZIM file generation process, which is probably the oldest and most important problem we still face at Kiwix.
All this would not have been possible without the support of:
* Wikimedia CH and the "ZIM autobuild" project
* Wikimedia France and the Afripedia project
* Gwicke from the WMF Parsoid dev team.
BTW, we need additional help from developers with javascript/nodejs skills to fix a few issues in mwoffliner:
* Recreate the "table of contents" based on the HTML DOM (*)
* Scrape the Mediawiki Resourceloader in such a way that it continues to work offline (***)
* Scrape categories (**)
* Localize the script (*)
* Improve the overall performance by introducing the use of workers (**)
* Create nodezim, the libzim nodejs binding, and use it (***, also needs compilation and C++ skills)
* Evaluate the work necessary to merge mwoffliner and the new WMF PDF Renderer (***)
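To give potential contributors an idea of the first item, here is a minimal sketch of how a nested table of contents could be rebuilt from the headings of a dumped page. It assumes that every heading carries an `id` attribute, and it uses a regular expression in place of a real DOM parser to stay self-contained. The function names `extractHeadings` and `buildToc` are made up for this example and do not exist in mwoffliner.

```javascript
// Minimal sketch: rebuild a nested table of contents from heading tags.
// Assumption: headings look like <h2 id="...">Title</h2>. A regular
// expression stands in for a real DOM parser to keep the example short.
'use strict';

// Collect all <h2>..<h6> headings in document order.
function extractHeadings(html) {
  const pattern = /<h([2-6])[^>]*\sid="([^"]*)"[^>]*>([\s\S]*?)<\/h\1>/gi;
  const headings = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    headings.push({
      level: Number(match[1]),
      id: match[2],
      title: match[3].replace(/<[^>]+>/g, '').trim(), // drop inline markup
      children: [],
    });
  }
  return headings;
}

// Turn the flat heading list into a tree, using a stack of open sections.
function buildToc(headings) {
  const root = { level: 1, children: [] };
  const stack = [root];
  for (const heading of headings) {
    // Close every section that is at the same depth or deeper.
    while (stack[stack.length - 1].level >= heading.level) {
      stack.pop();
    }
    stack[stack.length - 1].children.push(heading);
    stack.push(heading);
  }
  return root.children;
}
```

Calling `buildToc(extractHeadings(html))` returns the top-level sections, each with its subsections in `children`, ready to be rendered as nested lists linking to `#id`.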
Emmanuel