Hello,
I am making an application in Python which will make heavy use of Wikipedia data.
I've downloaded the 10.4GB .zim archive of Wikipedia's English articles, so that I
can run this application offline, and so I don't have to rely on extensive hits to
Wikipedia's API or website.
Perhaps my Google-fu is very rusty, but I can't find any documentation or tools for
accessing the zim file from Python. I've found pyzim
(https://code.launchpad.net/zim/pyzim), but this appears to be a Python build of the zim
reader and writer, not a Python library to allow me to search and access information in
the zim file from my own code. If it does include a library of such functions, I can't
find them, or any documentation or examples, etc. Ideally, I'd like to be able to
search the zim file just like searching Wikipedia (the zim file is already indexed as I
understand it?), and I'd like to be able to pull out articles, in JSON form for
example.
Can someone point me to a resource to get me started with this?
I hope this is an OK question for this mailing list. If not, any suggestions for a better
place to ask this question would be most welcome!
Thanks in advance!