On 08/09/2011 09:22, Ariel T. Glenn wrote:
I expect people will need a script to download these files easily; didn't someone on this list have a tool in the works?
I wrote this simple bash script: https://github.com/SoNetFBK/wiki-network/blob/master/download_dumps.sh
It's really simple to use. Usage: download_dumps.sh LANG [OUTPUT_DIR] [MATCHING_STRING]
Examples:
- download_dumps.sh en -> downloads every latest file from enwiki
- download_dumps.sh en /mydata/dumps -> the same, but saves everything in /mydata/dumps
- download_dumps.sh en /mydata/dumps history -> the same, but downloads only the files whose names contain the word "history" (you can use a regex too!)
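
To give an idea of what the MATCHING_STRING argument does, here is a minimal sketch of that kind of filtering. It is not copied from download_dumps.sh: the function name filter_dump_files and the file names are made up for illustration, and I'm assuming the pattern is treated as an extended regex matched against each file name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from download_dumps.sh): read file names
# on stdin and keep only those matching the given extended regex.
filter_dump_files() {
    local pattern="${1:-.}"   # no pattern given: "." matches every non-empty name
    grep -E -- "$pattern"
}

# Made-up file names, for illustration only.
printf '%s\n' \
    'enwiki-latest-pages-articles.xml.bz2' \
    'enwiki-latest-pages-meta-history1.xml.7z' \
    'enwiki-latest-stub-meta-history.xml.gz' |
    filter_dump_files 'history'
```

With the pattern 'history' only the two names containing that word are printed; a regex such as 'history[0-9]+' would narrow it further to the numbered history file.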
P.S.: in the same repo you'll find other interesting tools for analyzing the dumps (extracting a social network from user talk pages, content analysis, etc.). If you need more info, write to me ;)