As you may know, the German Wikipedia has been published on DVD several times now, most recently only a few weeks ago. This is thanks to directmedia [1], a German company.
For this latest release, they have changed their data format to accommodate the sheer mass that is Wikipedia. Instead of their (proprietary) reader software, they now use software that acts as a mini-server, so one can read and search Wikipedia from DVD or hard disk in the browser. While this mini-server is proprietary as well, they have asked for the development of an open source implementation, opening their format description [2].
DaB has written a version in Java, and I have started one in C++/wxWidgets. I put mine into SVN, "yawr" (Yet Another Wikipedia Reader). Article browsing with images works quite well already, and I'm working on (fulltext!) search, some special pages, and a zillion other things.
Just letting you know what that thing I put in SVN is, and inviting everyone to help.
Merry Christmas, Magnus
[1] http://www.digitale-bibliothek.de
[2] http://wiki.directmedia.de/index.php/ZenoReader/Open_Source
There is also a reader written in C++ using the small C++ application server tntnet. It can already read and display the zeno file format (served via HTTP) and will soon be able to use the Wikimedia XML dumps as well.
Currently some developers are working on a port of tntnet to Windows (using the poco classes to make it platform independent). As soon as the port is ready, I will extend it with a small dock application that makes it usable on a desktop system (as an HTTP daemon, tntnet is currently just a plain console-based application).
On 25/12/06, Manuel Schneider manuel.schneider@wikimedia.ch wrote:
Currently some developers are working on a port of tntnet to Windows (using the poco classes to make it platform independent). As soon as the port is ready, I will extend it with a small dock application that makes it usable on a desktop system (as an HTTP daemon, tntnet is currently just a plain console-based application).
I am personally interested in seeing if the three different versions reconcile over time, and include similar features and approaches to different issues. At the end of all this, it would be awesome if we had a single, platform-independent reader for a format we could generate as part of the dump process - it would go a very long way, I feel, to encouraging other projects to get up off their wotsits and publish something.
Rob Church
On 12/25/06, Magnus Manske magnusmanske@googlemail.com wrote:
DaB has written a version in Java, and I have started one in C++/wxWidgets. I put mine into SVN, "yawr" (Yet Another Wikipedia Reader). Article browsing with images works quite well already, and I'm working on (fulltext!) search, some special pages, and a zillion other things.
The Polish Wikipedia also has a reader/browser, which builds an index during installation and then uses an HTML dump in the browser to display the contents.
Myself, I'm working on a XUL interface which could directly search XML or SQL files. So yeah, looks like we have a lot of readers going on.
wikitech-l@lists.wikimedia.org