On 12/09/2011 05:52 PM, Platonides wrote:
I'm surprised by the number of uncompressed files there (i.e. .xml or .sql). Often they wouldn't even need to be decompressed.
The popular pywikipediabot framework has an -xml: option, and I used to believe that it required the filename of an uncompressed XML file. But I was wrong. The following works just fine:
python replace.py -lang:da \
    -xml:../dumps/dawiki/dawiki-20110404-pages-articles.xml.bz2 \
    dansk svensk
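For anyone writing their own scripts, the same idea can be sketched with nothing but the Python standard library. This is only an illustration of reading a .bz2 dump in place, not how pywikipediabot does it internally; the function name count_pages is made up for the example, and it assumes the dump wraps each article in a <page> element.

```python
import bz2
import xml.etree.ElementTree as ET


def count_pages(path):
    """Count <page> elements in a bz2-compressed XML dump, streaming.

    Hypothetical helper: nothing is ever written to disk uncompressed.
    """
    count = 0
    with bz2.open(path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Dump elements carry an XML namespace, so compare the
            # local part of the tag name only.
            if elem.tag.rsplit("}", 1)[-1] == "page":
                count += 1
                elem.clear()  # drop the finished page's children
    return count
```

Called as count_pages("dawiki-20110404-pages-articles.xml.bz2"), it decompresses block by block as the parser asks for more data, so disk usage stays at the size of the compressed file.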
If the following also worked (it does not), we wouldn't have to worry about disk space at all:
python replace.py -lang:da \
    -xml:http://dumps.wikimedia.org/dawiki/20111202/dawiki-20111202-pages-articles.xm... \
    dansk svensk
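Nothing stops a standalone script from doing this today, though. Here is a rough sketch, assuming plain HTTP access to the dump and using only the standard library; decompress_chunks and read_remote_bz2 are names I made up, and this is not an existing pywikipediabot feature.

```python
import bz2
import urllib.request


def decompress_chunks(chunks):
    """Yield decompressed bytes from an iterable of bz2-compressed chunks."""
    decomp = bz2.BZ2Decompressor()
    for chunk in chunks:
        while chunk:
            if decomp.eof:
                # The previous bz2 stream ended; a file may hold several
                # concatenated streams, so start a fresh decompressor.
                decomp = bz2.BZ2Decompressor()
            data = decomp.decompress(chunk)
            if data:
                yield data
            # Bytes left over after the end of a stream belong to the next one.
            chunk = decomp.unused_data if decomp.eof else b""


def read_remote_bz2(url, chunk_size=64 * 1024):
    """Yield decompressed bytes from a remote .bz2 file without saving it."""
    with urllib.request.urlopen(url) as response:
        yield from decompress_chunks(
            iter(lambda: response.read(chunk_size), b""))
```

The pieces yielded by read_remote_bz2(url) could be fed to an incremental XML parser as they arrive. The obvious trade-off is that every run downloads the whole dump again, so this only helps when disk space is scarcer than bandwidth.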