On Fri, 1 Apr 2022, at 02:26, arithmatlic(a)gmail.com wrote:
Hello data dump enthusiasts,
I've been working on an API (written in Python) for downloading files
from the Wikimedia data dumps, and I've just released its first
version here:
https://github.com/jon-edward/wiki_dump
I'd love to know what you all think (how it can be improved, how you
are interested in using it, positive/negative remarks) if you can spare
the time.
Great, I was looking for something like that. I had some trouble getting the example to
run, so I opened an issue on GitHub.
I'm interested in support for the new HTML dumps, and in some logic to automatically read
files from local storage when the code runs on Toolforge.
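
To make the local-storage idea concrete, here is a rough sketch of what such logic might
look like. It assumes that the public dumps are mounted read-only under
/public/dumps/public on Toolforge, with the same directory layout as dumps.wikimedia.org.
The function resolve_dump and both constants are hypothetical names for illustration, not
part of wiki_dump.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumption: location of the read-only dumps mount on Toolforge.
LOCAL_DUMPS_ROOT = Path("/public/dumps/public")
# Host that serves the public dumps over HTTPS.
DUMPS_HOST = "dumps.wikimedia.org"


def resolve_dump(url: str, local_root: Path = LOCAL_DUMPS_ROOT) -> tuple[str, str]:
    """Decide whether a dump file can be read locally or must be downloaded.

    Returns ("local", path) if a file mirroring the URL path exists under
    local_root, otherwise ("remote", url). Hypothetical helper, not wiki_dump API.
    """
    parsed = urlparse(url)
    if parsed.netloc == DUMPS_HOST:
        # Map the URL path onto the local mount, e.g.
        # /enwiki/latest/foo.bz2 -> <local_root>/enwiki/latest/foo.bz2
        candidate = local_root / parsed.path.lstrip("/")
        if candidate.is_file():
            return "local", str(candidate)
    # Not a dumps URL, or the file is not present locally: fall back to download.
    return "remote", url
```

A caller could open the returned path directly in the "local" case and only start a
download in the "remote" case, so the same code would work both on and off Toolforge.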
– Jan