Thanks to Emmanuel and others for their work in making Kiwix even better!
We need to work on updating content, as suggested here:
I would really like to work on preparing our next release of the English
Wikipedia (called Version 0.9), but I need some technical help in
preparing the collection. User:CBM has done this in the past, but he is
unable to do this now. Can anyone here help out? I believe that some
knowledge of Perl and the WP:1.0 bot may be needed. CBM has offered to
help in teaching a new person.
Martin A. Walker
Department of Chemistry
SUNY College at Potsdam
Potsdam, NY 13676 USA
+1 (315) 267-2271
The second release candidate of Kiwix 0.9 (Kiwix 0.9 rc2) is now online!
The most important improvements are:
* FIXED: kiwix-install crashes
* FIXED: kiwix-serve memory leak on ARM (ID: 3538663)
* FIXED: Open link in new tab from the right-click contextual menu (ID:
* FIXED: Problems in creating PDF/HTML from tabs (#362)
* FIXED: Kiwix crashes on MS/Windows with "special" files (#317)
* FIXED: Indexing progression computation (more accurate)
* FIXED: Broken suggestion/search textbox with “syllable input mode”
* FIXED: Internet Explorer compatibility issues with kiwix-serve.
* FIXED: Sporadically dying (serving big contents) kiwix-serve (#371)
* FIXED: kiwix-* console tools do not start on OSX
* NEW: New supported user interface languages: Zazaki (diq), Javanese
(jv), Khmer (km), Urdu (ur), Burmese (my), South Azerbaijani (azb)
* NEW: "Inverted colors" feature (ID: 3551975)
* NEW: Start Kiwix from the console with -search cmd line argument (ID:
* NEW: Search suggestions in kiwix-serve (ID: 2945983)
* NEW: Desktop shortcut creation in Windows installer
* NEW: kiwix-serve (HTTP server) for MS/Windows (#140)
* NEW: Persistent bookmark set over sessions (#188)
Detailed changelog may be found at:
The most important improvement is the integration/port of kiwix-serve
(our ZIM HTTP server) into the Kiwix UI for Windows, OSX and Linux. That
means that everybody can now share Wikipedia on their LAN in a few
clicks. We believe this could be a really practical feature, for
example in schools.
* Software at https://sourceforge.net/projects/kiwix/files/0.9_rc2/
* Content at http://www.kiwix.org
Report bugs and request features at:
Although Kiwix is already really stable, we think we will need one or
two more release candidates. We want to improve the
file/profile/index management a little, fix the broken CLucene fulltext
backend, and a few other things.
Stay tuned at:
Let me address your question, on how to add images to your local copy
of enwiki, in part 4 below. But some preliminaries might be in order:
1) ENWIKI. Building a mirror of <http://en.wikipedia.org/> is the
most demanding case. Details on how to do so can be found in the
WP-MIRROR Reference Manual
enwiki is still growing, so if you decide to download all the images,
then I would suggest purchasing 3T hard drives, rather than 2T.
2) SIMPLEWIKI. If you have not already done so, you may wish to take
a look at <http://simple.wikipedia.org/> to see if mirroring the
simplewiki meets your needs. Simple English means shorter sentences
and is intended for those who learned English as a Second Language
(ESL). The most recent dump file
has 123k articles, 66k images, and occupies 60G of hard disk space.
3) WP-MIRROR. Wp-mirror is a mirror building utility. It uses
MediaWiki and imports the articles into databases managed by MySQL.
It downloads full size images. By default, it builds a mirror of the
`simple' wikipedia, but it can be configured for any set of
wikipedias. It works `out-of-the-box' for GNU/Linux distributions:
Debian 7.0 (wheezy) and Ubuntu 12.10 (quantal). Home page
As you mentioned that you are `currently running a WAMP solution', I
should point out that WP-MIRROR has not been ported to Windows.
And now to your question:
4) IMAGES. There is a way to download SOME of the images (rather than
all) for the enwiki. Wp-mirror, as part of its duties: 1) splits the
dump file into chunks (x-chunks) of 1000 pages each, 2) scrapes each
x-chunk to find image file names, and 3) generates a shell script
(i-chunk) for downloading the image files referenced in the
corresponding x-chunk. This means that you can run just the i-chunks
that you want.
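The scraping step (2 above) is easy to picture. The following is only an
illustrative sketch, not wp-mirror's actual code; it assumes an x-chunk is
plain wikitext and looks only for [[File:...]] and [[Image:...]] links
(real wikitext references media in more ways, e.g. templates and galleries):

```python
import re

# Match [[File:Name.ext|...]] and [[Image:Name.ext|...]] links in wikitext.
# Simplified: templates and <gallery> tags are not handled here.
IMAGE_LINK = re.compile(r"\[\[(?:File|Image):([^|\]]+)", re.IGNORECASE)

def scrape_image_names(wikitext):
    """Return the image file names referenced in a chunk of wikitext."""
    return [name.strip() for name in IMAGE_LINK.findall(wikitext)]
```

For example, scrape_image_names("[[File:Map.png|thumb]] and [[Image:Logo.svg]]")
returns ["Map.png", "Logo.svg"], while plain [[Category:...]] links are ignored.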
Example: Running the first 100 i-chunks would download the images for
the first 100,000 pages, which are the pages that are the oldest,
largest, and most decorated with images.
So the method might work as follows: 1) install wp-mirror on a laptop
with wheezy or quantal, 2) configure it for enwiki, 3) run it just
long enough to generate the i-chunks, 4) abort wp-mirror, 5) run the
desired i-chunks manually, and 6) move the images over to your WAMP
server.
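Step 5 (running the desired i-chunks by hand) can be scripted. A minimal
sketch follows; the i-chunk file naming pattern here is an assumption for
illustration only, so check the names wp-mirror actually writes into its
working directory before using anything like this:

```python
import glob
import os
import subprocess

def run_ichunks(working_dir, count):
    """Run the first `count` i-chunk shell scripts, in sorted order.

    ASSUMPTION: i-chunks are shell scripts matching *ichunk*.sh in
    `working_dir`; wp-mirror's real naming scheme may differ.
    """
    chunks = sorted(glob.glob(os.path.join(working_dir, "*ichunk*.sh")))
    for script in chunks[:count]:
        subprocess.run(["sh", script], check=True)  # stop on first failure
    return chunks[:count]
```

Running the first 100 i-chunks from the example above would then be
run_ichunks("/path/to/working/dir", 100).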
5) PERFORMANCE. At the suggestion of Jason Skomorowski, the next
version of wp-mirror will have a number of performance enhancements.
In particular, the i-chunks will make use of HTTP/1.1 persistent
connections [RFC2616]. If you are in a hurry, wp-mirror 0.5 should
work fine; but, if you can wait a month or so, version 0.6, when
released, will download image files with far less latency.
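The latency win from persistent connections is simply that one TCP
connection serves many GET requests instead of a new handshake per file.
A sketch of the idea (not wp-mirror code) is below. The hashed directory
layout in commons_path is the standard Wikimedia upload scheme (first one
and two hex digits of the MD5 of the underscored file name), though
URL-encoding of special characters is omitted here for brevity:

```python
import hashlib
import http.client

def commons_path(filename):
    """Build the upload.wikimedia.org path for a Commons original.

    Wikimedia stores originals under /wikipedia/commons/<x>/<xy>/<name>,
    where <x> and <xy> are the first one and two hex digits of the MD5
    of the name (spaces replaced by underscores). URL-encoding omitted.
    """
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return "/wikipedia/commons/%s/%s/%s" % (digest[0], digest[:2], name)

def fetch_all(filenames):
    """Fetch several files over ONE persistent HTTP/1.1 connection."""
    conn = http.client.HTTPConnection("upload.wikimedia.org")  # reused throughout
    try:
        for filename in filenames:
            conn.request("GET", commons_path(filename))
            resp = conn.getresponse()
            data = resp.read()  # must drain the body before the next request
            # ... write `data` to disk here ...
    finally:
        conn.close()
```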
Let me know if I can be of any help.
Please help us to test new release!
-------- Original Message --------
Subject: Kiwix 0.9 RC2 pre-release
Date: Thu, 20 Dec 2012 17:01:34 +0100
From: Emmanuel Engelhart <kelson(a)kiwix.org>
CC: Mailing list for Wikimedia CH <wikimediach-l(a)lists.wikimedia.org>
Dear Kiwix testers
After 5 months of work, we have finished the new release candidate
(RC2) version of Kiwix 0.9. We plan to release it officially on Monday,
but wanted to give ourselves a few days, with your help, to get a
chance to detect critical bugs and regressions.
All binaries are available here:
You may find the whole changelog there:
The most essential features to test *on all OSes* are:
* kiwix-serve (HTTP server) integration in UI (menu tools>server)
* kiwix-* console tools (particularly kiwix-serve)
Thank you in advance for your feedback
I have a couple of quick questions that I couldn't really find answered
anywhere online. I'm trying to find the best way to get images into my
local copy of en-wiki.
1. The image tarball from your.org for en-wiki (
is *around 2 terabytes total!* That's way too big for me.
2. I'm also considering using *Wikix*, but I couldn't find any recent
references to the size of the images it would download (everything I found
3. And lastly, I'm wondering if there is a way to turn a .ZIM file back
into database format so I can import it into MySQL (I'm currently
running a WAMP server solution).
The big thing here is that I *do not need all of the text of en-wiki*
(hence the .zim consideration) *NOR all of the images*. I want to do
this in the most economical way possible. Does anyone have any suggestions?
We have revamped the audience measurement system of Kiwix. We now
provide an easy and practical tool for people who want detailed and
accurate data about the offline Wikipedia downloads. Everything is
It's important to note that there are two sites (there is a selector at
the top-right corner):
* www.kiwix.org for the web visitors
* download.kiwix.org for the file (ZIM, ZIP, ...) downloads
For now we do not have much historical data, because the tool is new,
but you can already use it and look at the data from the last few days.
We can especially see the impact of putting a notice on Wikipedia
(currently on the WPAR and WPFA) to advertise the existence of Wikipedia
I have also written a technical blog post explaining how it works:
Dear list members,
I am pleased to announce the release of WP-MIRROR 0.5.
The main design objective was this: WP-MIRROR 0.5 should be `feature
complete'. It is now deemed beta-ware. Examples of new features are:
Concurrency: Adding, building, and dropping wikipedias from the
mirror can now be done concurrently. The main use case is this:
During the build of the default `simple' wikipedia, one may add
additional wikipedias to the mirror.
Images: WP-MIRROR 0.5 does a more thorough job of identifying image
files to be downloaded.
Interwiki links: If the mirror has two or more wikipedias, then
Interlanguage links now appear in the Navigation Bar. That is, when
browsing an article in one wikipedia, the link to the corresponding
article in the other wikipedia appears in the Navigation Bar, so that
the user can readily switch between the two.
Packaging: The DEB package for WP-MIRROR 0.5 should work
`out-of-the-box' with no user configuration for the following:
o Debian GNU/Linux 7.0 (wheezy)
o Ubuntu 12.10 (quantal)
Virtual Host: URL has been changed to <http://simple.wpmirror.site/>.
Consistency of naming was the motivation.
The web page <http://www.nongnu.org/wp-mirror/> has been updated.
Feedback is welcome.
Dr. Kent L. Miller