[Xmldatadumps-l] another month, another dump. ho hum :-P
fox
fox91 at anche.no
Fri Sep 16 12:52:42 UTC 2011
On 08/09/2011 09:22, Ariel T. Glenn wrote:
> I expect people will need a script to download these files easily;
> didn't someone on this list have a tool in the works?
I wrote this simple bash script
https://github.com/SoNetFBK/wiki-network/blob/master/download_dumps.sh
It's really simple to use.
Usage: download_dumps.sh LANG [OUTPUT_DIR] [MATCHING_STRING]
Examples:
- download_dumps.sh en -> downloads all the latest files from enwiki
- download_dumps.sh en /mydata/dumps -> the same but saves everything in
/mydata/dumps
- download_dumps.sh en /mydata/dumps history -> the same but downloads
only the files that contain the word "history" in the name (you can use
regex too!)
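
To illustrate the third example, here is a rough sketch of how a MATCHING_STRING filter could behave. It is not the code from download_dumps.sh: the function names filter_dump_files and sample_names are hypothetical, the file names are illustrative only, and the use of grep -E (extended regular expressions) is an assumption about the matching style.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT taken from download_dumps.sh.
# filter_dump_files PATTERN reads file names on stdin and prints only
# those matching PATTERN (assumed here to be an extended regex).
# An empty or missing PATTERN keeps every name.
filter_dump_files() {
    local pattern="${1:-}"
    if [ -z "$pattern" ]; then
        cat
    else
        # "|| true" keeps the exit status at 0 when nothing matches.
        grep -E -- "$pattern" || true
    fi
}

# Illustrative file names only; a real dump listing will differ.
sample_names() {
    printf '%s\n' \
        "enwiki-latest-pages-articles.xml.bz2" \
        "enwiki-latest-pages-meta-history1.xml.bz2" \
        "enwiki-latest-stub-meta-history.xml.gz" \
        "enwiki-latest-abstract.xml"
}

# Keep only the names containing the word "history".
sample_names | filter_dump_files "history"
```

With a filter of this kind, a plain word such as "history" and a regex such as "abstract|articles" are handled the same way, which would be one way to get the "you can use regex too" behaviour.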
p.s.: in the same repo you'll find other interesting tools for analyzing
the dumps (extracting a social network from user talk pages, content
analysis, etc.). If you need more info, write to me ;)
--
f.
"I didn't try, I succeeded"
(Dr. Sheldon Cooper, PhD)
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org - against proprietary attachments
http://about.me/fox91