On Fri, 1 Apr 2022, at 02:26, arithmatlic@gmail.com wrote:
Hello data dump enthusiasts,
I've been working on an API (written in Python) for downloading files from the Wikimedia data dumps, and I've just released its earliest version here: https://github.com/jon-edward/wiki_dump.
I'd love to know what you all think (how it can be improved, how you are interested in using it, positive/negative remarks) if you can spare the time.
Great, I was looking for something like that. I had some trouble getting the example to run, so I opened an issue on GitHub. I'm interested in support for the new HTML dumps, and in some logic to automatically read files from local storage when the code runs on Toolforge.
– Jan