[Xmldatadumps-l] another month, another dump. ho hum :-P

fox fox91 at anche.no
Fri Sep 16 12:52:42 UTC 2011


On 08/09/2011 09:22, Ariel T. Glenn wrote:
> I expect people will need a script to download these files easily;
> didn't someone on this list have a tool in the works?

I wrote a simple bash script for this:
https://github.com/SoNetFBK/wiki-network/blob/master/download_dumps.sh

It's really simple to use.
Usage: download_dumps.sh LANG [OUTPUT_DIR] [MATCHING_STRING]

Examples:
- download_dumps.sh en -> downloads all the latest files from enwiki
- download_dumps.sh en /mydata/dumps -> the same, but saves everything in
/mydata/dumps
- download_dumps.sh en /mydata/dumps history -> the same, but downloads
only the files whose names contain the word "history" (you can use a
regex too!)
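
To illustrate the optional MATCHING_STRING argument, here is a minimal sketch of how such a filter could work. It is not taken from download_dumps.sh: the helper names filter_dump_files and dump_url are hypothetical, and the https://dumps.wikimedia.org/<lang>wiki/latest/ URL layout is an assumption about where the latest files live.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read file names on stdin and keep only those that
# match an extended regex. With no argument, every non-empty name is kept.
filter_dump_files() {
    local pattern="${1:-.}"    # "." matches any non-empty line
    grep -E -- "$pattern"
}

# Hypothetical helper: build the download URL for one file.
# The URL layout is an assumption, not read from the real script.
dump_url() {
    local lang="$1" file="$2"
    printf 'https://dumps.wikimedia.org/%swiki/latest/%s\n' "$lang" "$file"
}
```

Feeding a list of file names through filter_dump_files history and passing each surviving name to dump_url would give the URLs to hand to wget or curl.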

P.S.: in the same repo you'll find other interesting tools for analyzing
the dumps (extracting a social network from user talk pages, content
analysis, etc.). If you need more info, write to me ;)


-- 
f.

  "I didn't try, I succeeded"
  (Dr. Sheldon Cooper, PhD)

()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

http://about.me/fox91
