2010/12/16 Ángel González <keisial@gmail.com>:
On 15/12/10 16:21, Andrew Dunbar wrote:
I've long been interested in offline tools that make use of Wikimedia information, particularly the English Wiktionary.
I recently came across a tool that provides random access to a bzip2 archive without decompressing the whole archive. I would like to use it in my tools, but I can't get it to compile or run correctly with any free Windows compiler I have access to. It works fine on the *nix boxes I have tried, but my personal machine is a Windows XP netbook.
The tool is "seek-bzip2" by James Taylor and is available here: http://bitbucket.org/james_taylor/seek-bzip2
- The free Borland compiler won't compile it due to missing (Unix?) header files
- lcc compiles it but it always fails with error "unexpected EOF"
- mingw compiles it if the -m64 option is removed from the Makefile, but it then has the same behaviour as the lcc build
My C experience is now quite stale and my 64-bit programming experience negligible.
(I'm also interested in hearing from other people working on offline tools for dump files, wikitext parsing, or Wiktionary)
Andrew Dunbar (hippietrail)
Your problem is Windows text-mode streams. The attached patch fixes it.
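For readers without the attachment, here is a minimal sketch of the kind of change involved. It is not the attached patch itself, and the helper names open_raw, make_binary and count_bytes are hypothetical. It assumes the failure comes from the Windows C runtime reading the archive in text mode, where CR LF pairs are translated and a 0x1A byte is treated as end of file, which would be consistent with an "unexpected EOF" on compressed data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h>  /* _O_BINARY */
#include <io.h>     /* _setmode */
#endif

/* Hypothetical helper: open a compressed file as raw bytes.
 * The "b" flag has no effect on POSIX systems; on Windows it
 * turns off CR LF translation and the Ctrl-Z (0x1A) end-of-file marker. */
FILE *open_raw(const char *path)
{
    assert(path != NULL);
    return fopen(path, "rb");
}

/* Hypothetical helper: switch an already open stream (for example stdin)
 * to binary mode. Does nothing outside Windows. Returns 0 on success. */
int make_binary(FILE *stream)
{
    assert(stream != NULL);
#ifdef _WIN32
    return _setmode(_fileno(stream), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)stream;
    return 0;
#endif
}

/* Count the bytes that can be read from the stream until EOF.
 * In text mode on Windows this can come up short for binary data. */
long count_bytes(FILE *stream)
{
    unsigned char buf[4096];
    long total = 0;
    size_t n;

    assert(stream != NULL);
    while ((n = fread(buf, 1, sizeof buf, stream)) > 0) {
        total += (long)n;
    }
    return total;
}
```

If the tool uses open() and read() rather than stdio, the equivalent on Windows would be to add the O_BINARY flag to the open() call.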
Thank you for the link. I was completely unaware of it when I did basically the same thing for MediaWiki a couple of years ago: http://www.wiki-web.es/mediawiki-offline-reader/
Thanks Ángel! I feel like a fool for not realizing this. It's the same problem I've worked around many times in the past, but not recently. I just got a similar answer on stackoverflow.com.
By the way, I'm keen to find something similar for .7z.
It would be incredibly useful if these indices could be created as part of the dump creation process. Should I file a feature request?
Andrew Dunbar (hippietrail)