Hi all,
I am wondering whether you could give me some tips on finding an appropriate solution for our scenario! We have a small amount of resources to throw at this, and can hopefully develop something that will be useful for others.
We have a wiki here: http://orbit.educ.cam.ac.uk/ which has a small number of pages (by comparison!), but it does contain video.
We would like to be able to create an offline (read-only) copy of this. This is because some of the content is aimed at teacher education in Africa, where connectivity is poor. At one of the schools we work with we have a low-power server with a local wifi network, so basically we'd like to get an offline copy onto that server. HTML is probably the best format (as it could then be indexed for searching on our server), and we want to be able to generate updates fairly frequently over low bandwidth, so updates need to be incremental.
One approach I've taken in the past was this: pull HTML versions of pages via the API, then run regexps over them, e.g. to preserve local links between pages while leaving edit links pointing at the online version. This could be a good solution, but it requires creating a static HTML version on the server that is then periodically updated, and it took a lot of hacky regexps to get the pulled HTML into the right shape. It would be good to have something in the API that could output pages directly in a 'localised' HTML format. Another issue was that the API doesn't easily allow finding the most recent version of all pages that have changed since a certain date, so I had to mess around with the OAI extension to get this information. Overall, I got it to work, but it relied on so many hacks that it wasn't really maintainable.
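To make that concrete, here's roughly the shape of the hack, sketched in Python (untested; the api.php URL and the /wiki/ article path are guesses for our setup, and the regexps are exactly the hacky part I'd like to replace with proper API support):

    import re
    import requests  # any HTTP client would do; requests just keeps the sketch short

    API = "http://orbit.educ.cam.ac.uk/api.php"  # guessing the endpoint path
    ONLINE = "http://orbit.educ.cam.ac.uk"

    def fetch_parsed_html(title):
        # action=parse returns the rendered HTML for a single page
        r = requests.get(API, params={
            "action": "parse",
            "page": title,
            "prop": "text",
            "format": "json",
        })
        return r.json()["parse"]["text"]["*"]

    def localise(html):
        # keep page-to-page links local: /wiki/Some_Page -> Some_Page.html
        # (assumes the usual /wiki/ article path; adjust to taste)
        html = re.sub(r'href="/wiki/([^"]+)"', r'href="\1.html"', html)
        # ...but send everything else (edit links etc.) back to the live wiki
        html = html.replace('href="/index.php', 'href="' + ONLINE + '/index.php')
        return html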
What would be ideal would be a local script (i.e. on the remote server) managing all this, without us having to create an (intermediary) HTML copy every so often. The remote server contacts our wiki when it has connectivity and fetches updates at night. The only thing the wiki does is produce a list of pages that have changed since a certain date and (via the API) provide suitable HTML for download. I should say that massive scalability isn't an issue: we won't have loads of pages any time soon, and we won't have lots of mirror sites.
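Here's a rough Python sketch of what I imagine the nightly script doing, reusing fetch_parsed_html()/localise() from above. It approximates the "list of changed pages" with list=recentchanges plus client-side dedup (which still isn't quite "latest revision per page" out of the box, hence my OAI detour); the state file name and the first-run timestamp are made up:

    import json
    import os
    from datetime import datetime

    import requests

    API = "http://orbit.educ.cam.ac.uk/api.php"
    STATE = "last_sync.json"  # hypothetical state file kept on the remote server

    def changed_titles(since):
        # list=recentchanges enumerates edits from `since` onwards; collecting
        # titles into a set leaves us with "each changed page, once"
        # (continuation via rccontinue is omitted here for brevity)
        r = requests.get(API, params={
            "action": "query",
            "list": "recentchanges",
            "rcprop": "title",
            "rcstart": since,
            "rcdir": "newer",   # oldest first, starting at `since`
            "rctype": "edit|new",
            "rclimit": "500",
            "format": "json",
        }).json()
        return {rc["title"] for rc in r["query"]["recentchanges"]}

    def sync():
        since = "2012-01-01T00:00:00Z"  # made-up epoch for the very first run
        if os.path.exists(STATE):
            since = json.load(open(STATE))["since"]
        now = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
        for title in changed_titles(since):
            html = localise(fetch_parsed_html(title))  # from the sketch above
            # underscores so filenames match the rewritten /wiki/ links
            fname = title.replace(" ", "_").replace("/", "_") + ".html"
            with open(fname, "w") as f:
                f.write(html)
        json.dump({"since": now}, open(STATE, "w"))

A cron job on the low-power server could then run sync() whenever there is connectivity.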
Once we've got an HTML version (i.e. wiki -> local HTML), we'd also like to build a little Android app that can download such an HTML copy (again, ideally straight from the wiki, without an intermediary HTML copy somewhere). The app would manage downloading the updates, keep the content as HTML on the SD card, and let users launch the SD card content in a mobile browser. (We'd need to take care of how to handle video, probably using HTML5 video tags.)
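On the video side, I imagine the localiser rewriting whatever player markup the wiki emits into plain HTML5 video tags pointing at files shipped next to the pages. Purely illustrative (the markup to match depends on which media extension the wiki runs, and the videos/ layout on the SD card is made up):

    import re

    def localise_video(html):
        # turn links to video files into inline HTML5 players backed by
        # local copies under videos/ on the SD card
        return re.sub(
            r'<a\s[^>]*href="[^"]*/([^"/]+\.(?:ogv|webm|mp4))"[^>]*>.*?</a>',
            r'<video src="videos/\1" controls></video>',
            html,
            flags=re.IGNORECASE | re.DOTALL,
        )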
I would really appreciate your feedback on this! The above strategy (modifying the API to give localised HTML) seems simple enough to me - do you have particular views on what could work best?
Bjoern
(Btw. I am also interested in ePub, but that's a different scenario!)