Dear ZIM hackers,
I recently improved the Kiwix HTTP software called kiwix-serve. Small
reminder: this software is an HTTP server able to deliver ZIM file
contents, so it acts as a Web server. Kiwix-serve has the new ability to
deal with many ZIM files at the same time (so with only one binary
instance). That means you can have, on the same Web server, contents
belonging to many different ZIM files. You have a demo here: http://library.kiwix.org
Both the ZIM files we make at Kiwix and the ZIM files generated from
Wikipedia have article HTML with absolute internal URLs. That means that,
in the HTML of articles, a link pointing to the article "Wikipedia" (this
is an example) will have a URL like "/A/Wikipedia" (or "/A/Wikipedia.html"
in my case, but this does not matter).
Until now, this was not a problem because we always had a "one by one"
usage of ZIM files: the context was clear. But now, in my case, I need
to specify which ZIM file I want to deal with. If I want to open the
"Wikipedia" article in WPEN, I should have something like this:
Here is the problem: I have HTML code with URLs looking like
"/A/Wikipedia.html" and I need something like
"/wikipedia_en_all_nopic/A/Wikipedia.html". I have found a workaround by
rewriting the URLs on the fly, but this is an ugly solution which is
absolutely not sustainable.
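To make the workaround concrete, here is a minimal sketch of this kind of on-the-fly rewriting. It is not the actual kiwix-serve code: the function name prefix_internal_urls and the regular expression are assumptions made for illustration, and it only handles href and src attributes.

```python
import re

# Matches href="/..." or src="/..." but not protocol-relative "//host/..." URLs.
# Hypothetical pattern for illustration; real HTML may need a proper parser.
_ABSOLUTE_URL = re.compile(r'''(?P<attr>\b(?:href|src)\s*=\s*)(?P<quote>["'])/(?!/)''')


def prefix_internal_urls(html: str, book: str) -> str:
    """Prefix every absolute internal URL in `html` with "/<book>"."""
    return _ABSOLUTE_URL.sub(
        lambda m: f"{m.group('attr')}{m.group('quote')}/{book}/",
        html,
    )


page = '<a href="/A/Wikipedia.html">Wikipedia</a>'
print(prefix_internal_urls(page, "wikipedia_en_all_nopic"))
# prints: <a href="/wikipedia_en_all_nopic/A/Wikipedia.html">Wikipedia</a>
```

Every served page has to go through such a filter, and URLs built in CSS or JavaScript are missed entirely, which is why this approach does not scale.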
As far as I know, we do not have any specification covering this.
In my opinion, absolute internal URLs should be forbidden. If we
continue with my example: "/wikipedia_en_all_nopic/A/Wikipedia.html" ;
"wikipedia_en_all_nopic" is something decided by the kiwix-serve
operator, not something that should be imposed by the ZIM publisher. So,
the publisher cannot assume what the full absolute path could/should be,
and therefore should not use absolute paths for internal URLs. So, URLs
should be relative and I only see two options (I continue with my
example): if you are in the same namespace, simply use "Wikipedia.html";
otherwise go back to the relative root of the file: "../A/Wikipedia.html".
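As an illustration of the two options, the sketch below converts an absolute internal path into a relative one and checks that a browser-style resolution lands on the right URL whatever prefix the operator chose. The helper name to_relative and the example.org host are assumptions for the example; query strings, fragments, and percent-encoding are not handled.

```python
import posixpath
from urllib.parse import urljoin


def to_relative(target: str, current: str) -> str:
    """Return `target` as a URL relative to the article at `current`.

    Both arguments are absolute in-ZIM paths such as "/A/Wikipedia.html".
    """
    return posixpath.relpath(target, start=posixpath.dirname(current))


# Same namespace: only the file name is needed.
same_ns = to_relative("/A/Wikipedia.html", "/A/Kiwix.html")
# Different namespace: go back to the root of the file first.
other_ns = to_relative("/A/Wikipedia.html", "/I/index.html")

# The relative link resolves correctly under an operator-chosen prefix.
page = "http://example.org/wikipedia_en_all_nopic/A/Kiwix.html"
resolved = urljoin(page, same_ns)

print(same_ns)   # prints: Wikipedia.html
print(other_ns)  # prints: ../A/Wikipedia.html
print(resolved)  # prints: http://example.org/wikipedia_en_all_nopic/A/Wikipedia.html
```

Because the prefix never appears in the stored HTML, the same ZIM file works unchanged whether it is served alone or alongside others.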
Before starting to file a feature request for the Mediawiki:Collection
extension and patching my own ZIM generation scripts, I think we should
discuss and take a decision about that (and also update the specs on the
wiki afterward). I look forward to your feedback.
I recently made a "book" via the PediaPress Book Creator prior to my
trip to India, and it has been delightful to use and read on the flight and
in my hotel room here. It had been a while since I tried to make one, and I
wanted to say great work and good job to PediaPress! Also, the integration
with Kiwix was wonderful, and I love that it now shows up so seamlessly in
my "Library" within Kiwix.
I am not sure if you are aware, but in the recent Readership survey of
Wikipedia readers (from Sept 2011, which is only just now being analyzed),
the *number one request by readers was saving of articles for offline use
(as a PDF)*: 40% of readers said they would be MORE LIKELY to use Wikipedia
if such a service was available (note: this % is even higher in target
areas like India (50%) and Brazil (52%)). This is fascinating, for it
shows that (a) the desire for offline content is broader than just those
without Internet access, and (b) there is a great opportunity for
marketing the "Book Creator" tool.
I want to discuss the points needed to get to (b). The Book Creator tool is
great, and I think is the exact right type of tool to meet the needs of our
readers; but there is much room for improvement. Right now, I personally
find getting to and from the Book Creator tool less straightforward than
it could be. As this service is in such high demand, I think there are
some opportunities for refining the "book creator" tool and process.
I'd love thoughts on the following and
- *Rebranding:* What are our thoughts on the title "Book Creator"? I
wonder if the title itself is a bit confusing, since people are apparently
unaware of the ability to download as PDF at all! Plus, I personally don't
utilize the tools as a means for creating an actual book, though I
recognize this was the initial purpose: I view it as a way to read a couple
specific articles offline. I think using the word "collection," which we do
informally anyway, is likely more appropriate here. Perhaps "Offline
Collection Creator" or "Article Aggregator" (both terrible ideas, I know,
but I'm just throwing things out there:))
- *Website placement:* I think it is obvious that the space the Book Creator
takes on the left-hand toolbar is not enough to draw attention to the
feature. I wonder if we should attempt to have some sort of a "Save for
Offline Use" button on each article, which would then open a new window
into the collection creator screen? This could look similar to the "Share
this" links which exist on most information websites (for Facebook,
Twitter, email, etc.). This could be next to the "Print" button.
- *Marketing:* Once we feel a bit more confident about usability, it
would be great to market the tool. We can do this in three phases:
   - Phase 1: emails to different mailing lists announcing the project,
   and asking for suggestions and feedback on the tools
   - Phase 2: "pilot" testing of the tool, with banner advertising to
   - Phase 3: advertise this functionality via a banner at the top of
- *Measurement*: clearly, we should have careful tracking of *books
created* and *downloads by file type* by day. @PediaPress: is this
I have some other ideas as well, but wanted to throw these out there for
some immediate reactions. What are people's thoughts? Any other ideas?
Anyone good with website design who could help with rearranging of the
"Book Creator"?? :)
Looking forward to the discussion (which should be moved onto a wiki soon) -