Hi all,
I am wondering whether you could give me some tips on finding an appropriate solution for our scenario! We have a small amount of resources to throw at this, and can hopefully develop something that will be useful for others.
We have a wiki here: http://orbit.educ.cam.ac.uk/, which has a small number of pages (by comparison!), but it does contain video.
We would like to be able to create an offline (read-only) copy of this. This is because some of it is aimed at teacher education in Africa, where connectivity is poor. At one of the schools we work with we have a low-power server with a local wifi network, so basically we'd like to get an offline copy onto that server. HTML is probably the best format (as it could then be indexed for searching on our server), and we want to be able to generate updates fairly frequently (over low bandwidth), so updates need to be incremental.
One approach I've taken in the past was to pull HTML versions of pages via the API and run regexps on them, e.g. to preserve local links between pages but leave edit links pointing at the online version. This could be a good solution, but it requires creating a static HTML version on the server that is then periodically updated, and it takes a lot of hacky regexps to get the pulled HTML into the right format. It would be good to have something in the API that could output pages directly in a 'localised' HTML format. Another issue was that the API doesn't easily allow finding the most recent version of all pages that have changed since a certain date, so I had to mess around with the OAI extension to get this information. Overall I got it to work, but it relied on so many hacks that it wasn't really maintainable.
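To illustrate the kind of thing I mean, here is a rough Python sketch of that "pull HTML via the API and fix up the links" step. It is not my actual script; the API endpoint and the link-rewriting regexps are just assumptions about our URL layout:

import json
import re
import urllib.parse
import urllib.request

API = "http://orbit.educ.cam.ac.uk/api.php"   # assumed API endpoint

def fetch_rendered_html(title):
    """Fetch the parsed HTML of one page via action=parse."""
    query = urllib.parse.urlencode({"action": "parse", "page": title, "format": "json"})
    with urllib.request.urlopen(API + "?" + query) as resp:
        return json.loads(resp.read().decode("utf-8"))["parse"]["text"]["*"]

def localise_links(html):
    """Point page-to-page links at local files; keep edit links on the live wiki."""
    # /wiki/Page_name -> Page_name.html, so navigation works offline
    html = re.sub(r'href="/wiki/([^"#]+)', r'href="\1.html', html)
    # edit links go back to the online wiki (assumed URL layout)
    html = re.sub(r'href="(/index\.php\?[^"]*action=edit[^"]*)"',
                  r'href="http://orbit.educ.cam.ac.uk\1"', html)
    return html

if __name__ == "__main__":
    with open("Main_Page.html", "w", encoding="utf-8") as f:
        f.write(localise_links(fetch_rendered_html("Main_Page")))

In practice the regexps ended up far more involved than this (images, stylesheets, special pages), which is exactly the hacky part I'd like to avoid.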
What would be ideal would be to have a local script (i.e. on the remote server) managing this, without us having to create an (intermediary) HTML copy every so often. The remote server contacts our wiki when it has connectivity and fetches updates at night. The only things the wiki does are to produce a list of pages that have changed since a certain date and (via the API) to provide suitable HTML for download. I should say that massive scalability isn't an issue: we won't have loads of pages any time soon, and we won't have lots of mirror sites. Concretely, I imagine the nightly job looking something like the sketch below.
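A minimal sketch of that nightly sync, assuming only the standard MediaWiki API (list=recentchanges plus action=parse); the endpoint, file names and state file are made up for illustration, and API continuation/paging and error handling are left out:

import datetime
import json
import urllib.parse
import urllib.request

API = "http://orbit.educ.cam.ac.uk/api.php"   # assumed endpoint
STATE_FILE = "last_sync.txt"                   # remembers the last sync time

def api_get(**params):
    params["format"] = "json"
    url = API + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))

def changed_titles_since(timestamp):
    """Titles of pages edited or created since `timestamp` (ISO 8601, UTC)."""
    data = api_get(action="query", list="recentchanges",
                   rcend=timestamp,            # older bound; results run newest-first
                   rctype="edit|new",
                   rcprop="title|timestamp",
                   rclimit="500")
    return {rc["title"] for rc in data["query"]["recentchanges"]}

def sync():
    now = datetime.datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
    try:
        last_sync = open(STATE_FILE).read().strip()
    except FileNotFoundError:
        last_sync = "2000-01-01T00:00:00Z"     # first run: take whatever recentchanges still holds
    for title in changed_titles_since(last_sync):
        html = api_get(action="parse", page=title)["parse"]["text"]["*"]
        with open(title.replace("/", "_") + ".html", "w", encoding="utf-8") as f:
            f.write(html)                       # link localisation omitted here
    with open(STATE_FILE, "w") as f:
        f.write(now)                            # high-water mark for the next night

if __name__ == "__main__":
    sync()

The missing piece is the 'localised' HTML output: if the API could hand back page HTML with links already rewritten for offline use, the script above would be essentially all the remote server needs.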
Once we've got an HTML version (i.e. wiki -> local HTML), we'd also like to build a little app for Android that can download such an HTML copy (again, ideally straight from the wiki, without an intermediary HTML copy somewhere). The app would manage downloading the updates, keep the content as HTML on the SD card, and would allow users to open the SD card content in a mobile browser. (We'd need to take care of how to handle video, probably using HTML5 video tags.)
I would really appreciate your feedback on this! The above strategy (modifying the API to give localised HTML) seems simple enough to me - do you have particular views on what could work best?
Bjoern
(Btw. I am also interested in ePub, but that's a different scenario!)
Hi Björn,
Basically, ZIM files can hold any kind of content, so having videos in them is theoretically no problem at all.
1st question: Does Kiwix (the most popular ZIM reader) support video? As Kiwix uses XUL and is based on Firefox, chances are that Ogg/Theora videos work out of the box.
2nd question: Does the ZIM exporter (either Extension:Collection or Kiwix's perl tools) export embedded videos as well? When using the perl tools it should be no problem to amend this if needed.
/Manuel
Hi Manuel,
thanks for the answer!
ZIM would be for desktop use though. That's definitely a good start, and we should pursue this in parallel. How do you enable ZIM in the Collection extension?
However, my understanding is that ZIM files do not allow incremental updates. With our site in Africa, we can only push ~ 10s of MB per night, so incremental updates are essential. We're also looking for a server-based solution, so that we only need to distribute to our low-power server.
Thanks, Bjoern
Hi Björn,
On 18.06.2012 13:41, Bjoern Hassler wrote:
ZIM would be for desktop use though. That's definitely a good start, and we should pursue this in parallel. How do you enable ZIM in the Collection extension?
Check out these pages:
http://www.mediawiki.org/wiki/Extension:Collection
http://mwlib.readthedocs.org/en/latest/collection.html#installation-and-conf...
look for $wgCollectionFormats
However, my understanding is that ZIM files do not allow incremental updates. With our site in Africa, we can only push ~ 10s of MB per night, so incremental updates are essential. We're also looking for a server-based solution, so that we only need to distribute to our low-power server.
Well, out of the box it doesn't do that, but it's technically possible and has been discussed a few times. Small ZIM files with updated content can be used in parallel to a base file which holds the initial content. Split ZIM files are already supported out of the box. Tools like zim-merge have been discussed when talking about incremental updates.
Emmanuel (Kelson) and Tommi can tell us more about this.
/Manuel
On 18.06.2012 14:03, Manuel Schneider wrote:
Hi Björn,
...
Well, out of the box it doesn't do that, but it's technically possible and has been discussed a few times. Small ZIM files with updated content can be used in parallel to a base file which holds the initial content. Split ZIM files are already supported out of the box. Tools like zim-merge have been discussed when talking about incremental updates.
Emmanuel (Kelson) and Tommi can tell us more about this.
/Manuel
Right. It is my task to create a zim-merge tool. I have started coding again for openZIM, but it is not ready yet. The plan is to allow merging a base ZIM file with a smaller ZIM file containing the updated articles.
Tommi
On 18/06/2012 13:17, Manuel Schneider wrote:
1st question: Does Kiwix (the most popular ZIM reader) support video? As Kiwix uses XUL and is based on Firefox, chances are that Ogg/Theora videos work out of the box.
Kiwix supports HTML5 video as of version 0.9 RC1, which I will release today :)
2nd question: Does the ZIM exporter (either Extension:Collection or Kiwix's perl tools) export embedded videos as well? When using the perl tools it should be no problem to amend this if needed.
Still work to do there... but that is not the job of my perl tools; it is the job of the MediaWiki DumpHTML extension.
Emmanuel