I have converted the whole English Wikipedia to a TomeRaider file. The complete file is now available on my Pocket PC.
You will need a storage card for this: the file is roughly 100 MB and contains 145,000+ entries (including redirects) and 200 MB of (compressed) text. I chose to strip all meta info (discussions, user pages, etc.).
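As an illustration of what stripping meta info could involve, here is a minimal sketch in Python (not the Perl script mentioned in this message, which is not shown). It drops pages by title prefix. The prefix list and the function names are assumptions made for the example, not the list the script actually used.

```python
# Hypothetical namespace prefixes; the real list used by the script is not given.
META_PREFIXES = ("Talk:", "User:", "User talk:", "Wikipedia:", "Wikipedia talk:")


def is_meta_page(title):
    """Return True if the title starts with one of the assumed meta prefixes."""
    return title.startswith(META_PREFIXES)


def strip_meta(titles):
    """Keep only titles that are not in an assumed meta namespace."""
    return [t for t in titles if not is_meta_page(t)]
```

Redirects are ordinary article titles in this sketch, so they pass through the filter unchanged.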
TomeRaider (shareware) is a blindingly fast text retrieval system for handhelds (Pocket PC, Palm*, and EPOC*) and MS Windows. Index lookup is sub-second. Rendering very large articles or complex HTML tables may take a few seconds. Images are not supported, but hyperlinks and most HTML tags are.
I wrote a Perl script to accomplish this. Unfortunately, I do not have the web space or bandwidth to offer the file for download from my site.
So either a user or the Wikipedia support team will have to redo the conversion. Conversion to a TomeRaider input file is completely automated. Conversion plus import into TomeRaider takes about 90 minutes on my 500 MHz PC.
For screenshots and additional info on how to do this yourself, see: http://members.ams.chello.nl/epzachte/Wikipedia
Erik Zachte
* Not tested. I only have a PPC.
On Sun, 2003-04-06 at 11:59, Erik Zachte wrote:
> I have converted the whole English Wikipedia to a TomeRaider file. The complete file is now available on my Pocket PC.
Cool!
> I chose to strip all meta info (discussions, user pages, etc).
Be sure to include a URL to the original site for each page.
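One way such a back-link could be generated during conversion is sketched below in Python. The base URL, the underscore-for-space rule, and the function names are assumptions for illustration; the thread does not specify which URL form to use.

```python
from urllib.parse import quote

# Assumed base URL; the thread does not say which form should be used.
BASE_URL = "http://www.wikipedia.org/wiki/"


def source_url(title):
    """Build a link back to the original article from its title."""
    # Assumption: spaces in titles become underscores in the URL.
    return BASE_URL + quote(title.replace(" ", "_"))


def with_source_link(title, html):
    """Append a source link to an article's HTML body."""
    url = source_url(title)
    return html + '<p>Source: <a href="%s">%s</a></p>' % (url, url)
```

Since TomeRaider is said to support hyperlinks and most HTML tags, a plain anchor tag at the end of each entry seems like the simplest option.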
> Images are not supported.
Note that some images include alt text, which should be preserved:
[[Image:filename.ext|Alt text if image not available]]
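A minimal sketch of how a converter might keep the alt text while dropping the image itself, written in Python. The regular expression and the function name are assumptions for illustration, and the pattern only handles the simple form shown above (no nested links inside the alt text).

```python
import re

# Matches [[Image:filename.ext]] and [[Image:filename.ext|Alt text]].
IMAGE_LINK = re.compile(r"\[\[Image:([^\]|]+)(?:\|([^\]]*))?\]\]")


def replace_images_with_alt(wikitext):
    """Replace each image link by its alt text, or drop it if there is none."""
    return IMAGE_LINK.sub(lambda m: m.group(2) or "", wikitext)
```

For example, `replace_images_with_alt("See [[Image:map.png|Map of Amsterdam]] here.")` yields `"See Map of Amsterdam here."`.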
> I wrote a perl script to accomplish this. Unfortunately I do not have the web space or bandwidth to offer the file for download from my site.
> So either any user or the Wikipedia support team will have to redo the conversion. Conversion to a TomeRaider input file is completely automated. Conversion plus import in TomeRaider takes about 90 minutes on my 500 MHz PC.
Creating the TomeRaider file seems to require proprietary Windows-only software, but if there's interest, we could periodically produce converted dumps that can be used as input to it.
> For screen shots and additional info about how to do this yourself: http://members.ams.chello.nl/epzachte/Wikipedia
-- brion vibber (brion @ pobox.com)
Erik has provided converted TomeRaider files of the English, Dutch, and German Wikipedias, which I have put up for public download at:
http://www.wikipedia.org/tarballs/tomeraider/
(The files are compressed with bzip2.)
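For anyone without a bzip2 tool at hand, a downloaded file can also be unpacked with Python's standard `bz2` module. This is a minimal sketch; the function name and any paths passed to it are placeholders, not names from the download directory.

```python
import bz2
import shutil


def decompress_bz2(src_path, dest_path):
    """Decompress a .bz2 file to dest_path, streaming in chunks."""
    with bz2.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        # copyfileobj reads and writes in blocks, so a file of
        # ~100 MB is never held in memory all at once.
        shutil.copyfileobj(src, dest)
```

The decompressed file would then be copied to the handheld's storage card.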
I'm sure Erik would appreciate feedback from any PDA users who try them out! If there's interest, we can probably work out semiregular updates.
-- brion vibber (brion @ pobox.com)
wikipedia-l@lists.wikimedia.org