Hi Rupert,
Yes, ZIM is definitely one possibility, and something we would like to explore. We would like to be able to provide our resource on a memory stick, and ZIM could work well for that.
There are two potential drawbacks:
(1) ZIM requires reader software to open the file, so in some circumstances a plain HTML version might be the better option.
(2) Emmanuel mentions that incremental ZIM updates are on the roadmap, i.e. not available yet. For us, that's a very important feature, because we are dealing with low-bandwidth, high-cost connections, so we have to be able to create incremental updates.
So for now, we would probably be best off with ZIM as well as plain HTML.
Does the ZIM process first create a stand-alone HTML version that is usable on its own? That would be interesting.
Emmanuel has offered to create a ZIM file for us, and I am checking with our computing service at the moment whether we can run npm and nodejs on our server.
Bjoern
On 14 November 2013 11:58, rupert THURNER rupert.thurner@gmail.com wrote:
Is a zim file acceptable as well?
On 14.11.2013 10:50, "Bjoern Hassler" bjohas+mw@gmail.com wrote:
Hello!
What script would you recommend to create a static offline version of a MediaWiki site? (Perhaps with and without Parsoid?)
I've been looking for a good solution for ages, and have experimented with a few things. Here's what we currently do. It's not perfect, and really a bit too cumbersome, but it works as a proof of concept.
To illustrate, one of our wiki pages is here: http://orbit.educ.cam.ac.uk/wiki/OER4Schools/What_is_interactive_teaching
We have a "mirror" script that uses the API to generate an HTML version of a wiki page (which is then 'wrapped' in a basic menu):
http://orbit.educ.cam.ac.uk/orbit_mirror/index.php?page=OER4Schools/What_is_...
(Some log info is printed at the bottom of the page, which provides some hints as to what is going on.)
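For illustration, the API->HTML step described above could look roughly like the following Python sketch. This is not the actual mirror script (which is PHP). The API endpoint path and all function names are assumptions made for the example; the only MediaWiki feature relied on is action=parse with the default JSON format, where the rendered HTML sits under parse.text["*"].

```python
import html
from urllib.parse import urlencode

# Assumed endpoint: the thread only shows /wiki/ and /orbit_mirror/ URLs.
API_URL = "http://orbit.educ.cam.ac.uk/w/api.php"


def parse_request_url(title: str, api_url: str = API_URL) -> str:
    """Build an action=parse request URL for one wiki page."""
    params = {"action": "parse", "page": title, "prop": "text", "format": "json"}
    return api_url + "?" + urlencode(params)


def extract_html(response: dict) -> str:
    """Pull the rendered HTML out of a decoded action=parse JSON response."""
    return response["parse"]["text"]["*"]


def wrap_in_menu(title: str, body_html: str) -> str:
    """Wrap the page body in a very basic menu (hypothetical layout)."""
    return (
        "<html><head><title>" + html.escape(title) + "</title></head><body>"
        '<div class="menu"><a href="index.php">Home</a></div>'
        + body_html
        + "</body></html>"
    )
```

Fetching the URL (for example with urllib.request.urlopen) and decoding the JSON is left out so that the sketch stays independent of the network.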
The resulting page is as low-bandwidth as possible (which is one of our use cases). The original idea with the mirror PHP script was that you could run it on your own server: it only requests pages if they have changed, and it keeps a cache, which allows viewing pages when your server has no connectivity. (You could of course use a cache anyway, and there are advantages and disadvantages compared to this more explicit caching method.) The script rewrites URLs so that normal page links stay within the mirror, while links for editing and history point back at the wiki (see the tabs along the top of the page).
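The link rewriting described above can be sketched as follows. This is a simplified Python illustration, not the mirror script itself: the base URLs are taken from the example links in this message, while the rule "any link with an action= parameter points back at the wiki" and the function name are assumptions.

```python
from urllib.parse import parse_qs, quote, urlsplit

WIKI_BASE = "http://orbit.educ.cam.ac.uk"  # from the example URLs
MIRROR_INDEX = "/orbit_mirror/index.php"   # from the example URLs
ARTICLE_PREFIX = "/wiki/"                  # from the example URLs


def rewrite_link(href: str) -> str:
    """Rewrite one href found in the HTML of a wiki page."""
    parts = urlsplit(href)
    if parts.netloc:
        # Absolute link to some host: leave untouched.
        return href
    if "action" in parse_qs(parts.query):
        # Edit/history style links point back at the live wiki (assumed rule).
        return WIKI_BASE + href
    if parts.path.startswith(ARTICLE_PREFIX):
        # Normal page links stay within the mirror.
        title = parts.path[len(ARTICLE_PREFIX):]
        url = MIRROR_INDEX + "?page=" + quote(title, safe="/")
        if parts.fragment:
            url += "#" + parts.fragment
        return url
    # Anything else (images, stylesheets, ...) is left as it is.
    return href
```

For example, rewrite_link("/wiki/OER4Schools/What_is_interactive_teaching") gives "/orbit_mirror/index.php?page=OER4Schools/What_is_interactive_teaching".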
The mirror script also produces (and caches) a static web page, see here:
http://orbit.educ.cam.ac.uk/orbit_mirror/site/OER4Schools%252FHow_to_run_wor...
Once you have run wget across the mirror, the site is completely mirrored in '/site'. You can then tar up '/site' and distribute it alongside your w/images directory to get a static copy, or use rsync to incrementally update '/site' and w/images on another server.
There's also an API-based process that can work out which pages have changed and refreshes the mirror accordingly.
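As a rough sketch of the "tar up" step, the following Python function packs the static pages and the images directory into one archive using the standard tarfile module. The function name, the archive layout ("site" and "w/images" as top-level entries) and the example paths are assumptions; in practice the tar command line tool does the same job.

```python
import tarfile


def pack_site(site_dir: str, images_dir: str, archive_path: str) -> str:
    """Write site_dir and images_dir into a single .tar.gz archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # Directories are added recursively; arcname sets the path in the archive.
        tar.add(site_dir, arcname="site")
        tar.add(images_dir, arcname="w/images")
    return archive_path
```

A call such as pack_site("orbit_mirror/site", "w/images", "offline.tar.gz") would then produce an archive that can be unpacked on a memory stick or another server.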
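One way such a change check could work is via the MediaWiki recentchanges list, sketched below in Python. This is an assumption about the approach, not a description of the existing process; the endpoint path and function names are hypothetical, and API continuation (more changes than one request returns) is not handled.

```python
from urllib.parse import urlencode

# Assumed endpoint: the thread only shows /wiki/ and /orbit_mirror/ URLs.
API_URL = "http://orbit.educ.cam.ac.uk/w/api.php"


def recent_changes_url(since: str, api_url: str = API_URL) -> str:
    """Build a query for changes from the timestamp `since` onwards."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|timestamp",
        "rcdir": "newer",   # oldest first, so rcstart is the lower bound
        "rcstart": since,
        "rclimit": "500",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def changed_titles(response: dict) -> list[str]:
    """Return each changed page title once, in order of first appearance."""
    titles = []
    seen = set()
    for change in response.get("query", {}).get("recentchanges", []):
        title = change["title"]
        if title not in seen:
            seen.add(title)
            titles.append(title)
    return titles
```

The titles returned by changed_titles are the pages whose mirrored copies would need to be regenerated.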
Most of what I am using is in the MediaWiki software already (i.e. API->HTML), and it would be great to have a solution like this that could generate an offline site on the fly. Perhaps one could add another export format to the API, and then an extension could generate the offline site and keep it up to date as pages on the main wiki change. Does this make sense? Would anybody be up for collaborating on implementing this? Are there better things in the pipeline?
I can see why you perhaps wouldn't want it for one of the major wikimedia sites, or why it might be inefficient somehow. But for our use cases, for a small-ish wiki, with a set of poorly connected users across the digital divide, it would be fantastic.
So - what are your solutions for creating a static offline copy of a MediaWiki site?
Looking forward to hearing about it!
Bjoern
Offline-l mailing list Offline-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/offline-l