Hello!
What script would you recommend for creating a static offline version of
a MediaWiki wiki? (Perhaps with and without Parsoid?)
I've been looking for a good solution for ages, and have experimented
with a few things. Here's what we currently do. It's not perfect, and
really a bit too cumbersome, but it works as a proof of concept.
To illustrate, one of our wiki pages is here:
http://orbit.educ.cam.ac.uk/wiki/OER4Schools/What_is_interactive_teaching
We have a "mirror" script that uses the API to generate an HTML
version of a wiki page (which is then 'wrapped' in a basic menu):
http://orbit.educ.cam.ac.uk/orbit_mirror/index.php?page=OER4Schools/What_is…
(Some log info is printed at the bottom of the page, which gives
some hints as to what is going on.)
The resulting page is as low-bandwidth as possible (which is one of
our use cases). The original idea with the mirror PHP script was that
you could run it on your own server: it only requests pages if they
have changed, and keeps a cache, which allows viewing pages if your
server has no connectivity. (You could of course use a cache anyway;
there are advantages and disadvantages compared to this more explicit
caching method.) The script rewrites URLs so that normal page links
stay within the mirror, but links for editing and history point back
at the wiki (see the tabs along the top of the page).
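To make the idea concrete, here is a rough sketch in Python (not our actual PHP script) of the core API->HTML step: fetch a page's rendered HTML, cache it, and rewrite internal links. The api.php path, cache directory and rewriting rule are just placeholder assumptions:

    # Minimal sketch of the API->HTML mirroring idea described above.
    # Assumptions (not the real orbit_mirror script): the api.php location,
    # the 'site' cache directory, and the link-rewriting rule.
    import json
    import os
    import re
    import urllib.parse
    import urllib.request

    API = "http://orbit.educ.cam.ac.uk/w/api.php"   # assumed API endpoint
    CACHE_DIR = "site"                               # assumed cache directory

    def fetch_html(title):
        """Ask the MediaWiki API for the rendered HTML of one page."""
        params = urllib.parse.urlencode({
            "action": "parse", "page": title, "prop": "text", "format": "json"
        })
        with urllib.request.urlopen(API + "?" + params) as resp:
            return json.load(resp)["parse"]["text"]["*"]

    def mirror_page(title):
        """Cache one page locally, with ordinary links rewritten to stay in the mirror."""
        html = fetch_html(title)
        # Plain page links stay within the mirror; edit/history links (which
        # use index.php?action=...) are left alone, so they point back at the wiki.
        html = re.sub(r'href="/wiki/([^"?]+)"', r'href="index.php?page=\1"', html)
        os.makedirs(CACHE_DIR, exist_ok=True)
        path = os.path.join(CACHE_DIR, urllib.parse.quote(title, safe="") + ".html")
        with open(path, "w", encoding="utf-8") as f:
            f.write(html)
        return path

    mirror_page("OER4Schools/What_is_interactive_teaching")

(The real script additionally wraps the HTML in the menu and only re-fetches a page when it has changed.)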
The mirror script also produces (and caches) a static web page; see here:
http://orbit.educ.cam.ac.uk/orbit_mirror/site/OER4Schools%252FHow_to_run_wo…
Assuming you've run a wget crawl across the mirror, the site will
be completely mirrored in '/site'. You can then tar up '/site' and
distribute it alongside your w/images directory, and you have a static
copy, or use rsync to incrementally update '/site' and w/images on
another server.
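For completeness, the distribution step could look something like this rough sketch, run on the server that holds the '/site' cache; the remote destination is just a placeholder:

    # Rough sketch of the distribution workflow described above; run it on the
    # machine that holds the mirror's '/site' cache. REMOTE is a placeholder.
    import subprocess

    MIRROR = "http://orbit.educ.cam.ac.uk/orbit_mirror/"  # mirror entry point
    REMOTE = "user@offline-server:/var/www/"              # hypothetical rsync target

    # 1. Crawl the mirror so every page is requested once and therefore
    #    rendered and cached into '/site'.
    subprocess.run(["wget", "--mirror", "--no-parent", MIRROR], check=True)

    # 2. Bundle the static pages for one-off distribution alongside w/images ...
    subprocess.run(["tar", "-czf", "site.tar.gz", "site"], check=True)

    # 3. ... or push incremental updates (static pages plus wiki uploads).
    subprocess.run(["rsync", "-az", "site/", REMOTE + "site/"], check=True)
    subprocess.run(["rsync", "-az", "w/images/", REMOTE + "w/images/"], check=True)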
There's also an API-based process that works out which pages have
changed and refreshes the mirror accordingly.
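In essence, this is just a query against the recent changes list, something along these lines (again a sketch, using the same assumed endpoint as above):

    # Sketch of the change-detection step: ask the API which pages were edited
    # since the last run and refresh only those. The endpoint is an assumption.
    import json
    import urllib.parse
    import urllib.request

    API = "http://orbit.educ.cam.ac.uk/w/api.php"   # assumed API endpoint

    def changed_titles(since):
        """Return the titles of pages edited since the given ISO 8601 timestamp."""
        params = urllib.parse.urlencode({
            "action": "query", "list": "recentchanges",
            "rcend": since,          # walk back through changes until this timestamp
            "rcprop": "title", "rclimit": "500", "format": "json"
        })
        with urllib.request.urlopen(API + "?" + params) as resp:
            data = json.load(resp)
        return {rc["title"] for rc in data["query"]["recentchanges"]}

    # for title in changed_titles("2013-11-01T00:00:00Z"):
    #     mirror_page(title)   # re-using the mirror_page() sketch above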
Most of what I am using is in the MediaWiki software already (i.e.
API->HTML), and it would be great to have a solution like this that
could generate an offline site on the fly. Perhaps one could add
another export format to the API, and then an extension could generate
the offline site and keep it up to date as pages on the main wiki
change. Does this make sense? Would anybody be up for collaborating
on implementing this? Are there better things in the pipeline?
I can see why you perhaps wouldn't want it for one of the major
Wikimedia sites, or why it might be inefficient somehow. But for our
use cases, for a small-ish wiki with a set of poorly connected users
across the digital divide, it would be fantastic.
So, what are your solutions for creating a static offline copy of a MediaWiki wiki?
Looking forward to hearing about it!
Bjoern
Hi offline community,
how important is ZIM support in Collections (the "Create a book"
feature) on Wikimedia sites? We implemented this a while ago to
support offline efforts. Since collections are still typically very
much limited in size, it's not a very viable option for huge offline
exports; it's more suited to batches of articles on related topics. Do people
currently rely on this functionality for offline deployments?
We're re-implementing the rendering pipeline for Collections to ensure
long-term maintainability, and our default would be to initially
eliminate all formats except PDF if we don't absolutely have to
support them. I'll see if we can get some metrics on current ZIM file
usage via the Collection extension, but it would be nice to get
qualitative feedback as well.
(More background at: https://www.mediawiki.org/wiki/PDF_rendering )
Thanks,
Erik
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation
Hi,
The Google Code-in program started yesterday. Kiwix, together with other
Wikimedia FOSS projects, has proposed a few "small" tasks (a few days of work each).
Our tasks focus on the improvement of the new Kiwix for Android app.
Young and talented developers with skills in compilation, Java, or
JavaScript are more than welcome.
To have a look at our tasks, follow this link and add "Kiwix" to the
"Tags" filter:
https://google-melange.appspot.com/gci/tasks/google/gci2013
Feel free to ask me by email or on our IRC channel (#kiwix on Freenode) if
you have questions or simply need help.
Regards
Emmanuel
--
Kiwix - Wikipedia Offline & more
* Web: http://www.kiwix.org
* Twitter: https://twitter.com/KiwixOffline
* more: http://www.kiwix.org/wiki/Communication
Hi Emmanuel,
Thanks for the message!
> Kiwix-serve is able to serve any ZIM file:
> http://www.kiwix.org/wiki/Kiwix-serve
That's great!
>> '''Scenario 3: Use on tablets / phones.''' We would love to have a
> We have Kiwix for Android which is able to open any ZIM file:
> https://play.google.com/store/apps/details?id=org.kiwix.kiwixmobile
Yes - I saw the post about the new version, and it's very exciting.
> We want to develop a version for iOS, but for now there is no concrete agenda:
> http://www.kiwix.org/wiki/IOS
This would of course be good to have, but for our scenarios it won't
be needed, as iPhones/iPads are not very common in most places in
Africa. Android devices are far more competitively priced, so if
people have a smartphone or tablet, it's usually Android. (Though in
some places BlackBerry is still quite common.)
>> * '''Offline access:''' We'd love to have some advice how we can
> We have a still-in-development, but already working, solution for ZIM incremental
> update. This should be available for users in a few months. But, as far
> as I can see, your mediawiki is not too big, so the ZIM file shouldn't be
> too big either.
Well, we have a lot of audio and video. Some of it (the
audio/images/files) consists of wiki uploads, but the video is kept separately
(on YouTube, and on our server, so that it can be mirrored for local
use). Just the uploads, i.e. the 'w/images' directory, come to 1.2 GB
at the moment, so it's pretty substantial. So we might want to 'side
load' the video content, if that makes sense, and would definitely
need incremental updates for 'w/images'. The wiki text is not massive:
the number of articles is about 1,500.
> The real problem is the ZIM file generation. The future solution based
> on Parsoid should allow you (and anyone, if your wiki is public) to
> easily build a ZIM file of it. For now we need to fix things on the Parsoid
> and Kiwix side before having a perfectly usable solution.
Ok, so the use of Parsoid also means that the generated HTML5 is used
for the ZIM file, so any <div>s we put in (for boxes) will appear in
the output?
We are using Widgets (e.g. for YouTube), so would this also work with
Kiwix? (By embedding the video from YouTube, same as on our web site?)
> But, if you manage to get a dev instance of Parsoid working on your
> wiki, I would be happy to try to build a ZIM file for you:
> https://www.mediawiki.org/wiki/Parsoid
That would be great - many thanks! I'll see what I can do in terms of
getting Parsoid working. We're on a virtual server that I don't have
su rights for, but I'll see what I can do.
Many thanks,
Bjoern
Hi,
In the last 10 days, we have made two new releases for Android smartphones:
v1.5
* Fix regression in suggestions (impacting non-latin alphabets)
* Speed up the "night mode" a little bit
* Fix color issue with the search box (impacting only a few devices)
* Show software version in "options"
* Multiple other small UI improvements
v1.4
* Localization of the menu/buttons/settings
* Mask "Search in text" menu item if necessary
* Button re-ordering
* Allow multitouch zoom if zoom buttons are configured as "invisible"
* Add "Random article" button
* Fix buggy open procedure for filenames with "special characters"
* Avoid forcing first letter uppercase in search box
Kiwix for Android continues to progress and now has more than 5,000
active users.
Enjoy:
https://play.google.com/store/apps/details?id=org.kiwix.kiwixmobile
Regards
Emmanuel
--
Kiwix - Wikipedia Offline & more
* Web: http://www.kiwix.org
* Twitter: https://twitter.com/KiwixOffline
* more: http://www.kiwix.org/wiki/Communication
Yannick Guigui, 08/11/2013 14:34:
> @Emmanuel
>
> I'm in Douala
>
> On Friday 8 November 2013, Yannick Guigui wrote:
>
> I know Kiwix, but the webapp that I built uses these technologies:
>
> - WebStorage
> When the students are in the wifi area, they can consult all the
> articles in the Chrome browser (only the Google Chrome browser is
> required). Later, when the students are at home where there is
> no wifi connection, they can still consult all the articles they
> consulted while at school, because the webapp recorded every
> article they consulted.
With Kiwix they can download them locally; if you think it's important
to have many pages exportable at once, it shouldn't be too hard to add
such a feature I suppose.
>
> - Real Time Communication
> Students can chat within an article with other people in order to share
> knowledge and ideas in real time when they are in the wifi area. The
> chat is similar to the Facebook chat.
>
> - WebRTC
> Users (educators, professors, students) can also communicate by
> video conference (only stable on Google Chrome) in this webapp. They
> can use this feature to call somebody within the wifi area and
> have a real-time video conference via their webcams.
>
> I mention this because I know that Kiwix is also a browser, but I have the
> full power of new features when I use browsers like Chrome or Firefox,
> and that is why video conferencing is possible via WebRTC technology. The
> webapp has a responsive design and is compatible with any OS that has a
> browser.
It's clear that Kiwix doesn't do this :) but you could just use Ekiga or
something with your local server. Just saying.
>
> If I get small images from Wikipedia (French and English), my final
> problem will be resolved. I use the MediaWiki API to serve the
> webapp. The images that I request will be put in MediaWiki. I don't know
> which solution you can give me. If the ZIM format can be
> decompressed in order to extract images to use in MediaWiki,
I don't know if it can. It should be possible, in principle. I don't know
if it's easier than just downloading the thumbs yourself.
Nemo
> I can
> use this solution. What do you think, and what do you propose for my
> mediawiki?
>
> Thanks a lot (sorry for my English, I speak French)
Yannick Guigui, 08/11/2013 12:22:
> I'm Cameroonian. I built a webapp which allows students to consult
> Wikipedia articles
So you don't need the originals, only thumbs.
> without internet connectivity; many schools accepted
> the application, and the application is hosted on a server and shared by
> wifi in each school.
Nice! It seems you may want to use this existing software solution:
http://kiwix.org/wiki/Kiwix-serve
That way, you can use the available ZIM files with no need to generate
or download (and compress) the thumbnails yourself.
Example: http://www.wikimedia.fr/afripedia
Your help developing the software would be very useful and you could
avoid doing yourself what you don't have the resources (bandwidth) to do.
> I have all the other dumps of Wikipedia articles in
> French and English, but I don't have any images because they are too heavy
> for me to download on my side (3 TB) and I have low bandwidth (40
> kB/s when it's fast).
>
> The webapp works in a browser and I don't know if the ZIM format can be
> decompressed to get small images (jpeg, png, svg, ...).
Kiwix is a browser; you can save anything you want, AFAIK.
Nemo
>
> This is the video demo (3 min, in French) of the webapp:
> https://www.youtube.com/watch?v=0f-HJhOw1-U
>
> If I get small images in French and English to download to the app, my
> problem will be resolved.
>
> Thanks a lot, Federico
>
>
>
> On Friday 8 November 2013, Federico Leva (Nemo) wrote:
>
> Yannick Guigui, 08/11/2013 10:11:
>
> Please, I want to get all the images of Wikipedia, French and English.
> How much would it cost to get them on a hard disk? I can't download them
> because I don't have enough bandwidth from my country.
>
>
> What do you need them for?
> Originals would be about 2+1 TB and anyone can download and ship
> them for you:
> http://ftpmirror.your.org/pub/wikimedia/imagedumps/tarballs/fulls/20121201/
> Otherwise there are the ZIM files with thumbnails compressed,
> fr.wiki is 14 GB but en.wiki is not available yet.
> http://download.kiwix.org/zim/0.9/
>
> Nemo
>
XOWA v0.11.0 is a general release. The files are available here:
https://sourceforge.net/projects/xowa/files/v0.11.0/
These are the major changes since the last announcement (v0.9.0):
* Offline thumbnails for simplewiki
* Windows 64 bit JRE package
* Search word database for improved search performance.
* Scribunto fixes for English Wiktionary
* Support for 2013-09-10 Wikidata wikis.
* Special:Statistics page.
More changes are listed at [[Help:Change_log]].
As always, any feedback is appreciated.
Thanks.