Ariel and others have already touched upon this, but just in case you want more details (I'm trying to do something similar):
If your images are concentrated in one wiki (for example, if the 1.8 million images are all used in English Wikipedia articles), you can use the tarballs at your.org: http://ftpmirror.your.org/pub/wikimedia/imagedumps/tarballs/fulls/20121201/.
The latest set is from 2012-12, but it should form a good base for your images.
You'd want the tarballs with the prefix "enwiki-20121201-remote-media". These contain all the images used by enwiki whose [[File]] page is hosted on Commons ("remote"); by contrast, the "local" tarballs contain the images hosted directly by enwiki.
Some rough numbers:
- 24 tarball files, each about 90 GB
- approximately 2.3 million full-sized originals (not thumbs); also includes audio, video, PDF, etc.
- 2.1 TB of disk space in total
- about 14 days to download the full set (over an 18 Mbps connection)
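If it helps, here is a rough Python sketch of one way to fetch just those tarballs. It assumes the directory is served as a plain HTML index and uses the third-party requests library; the regex link extraction and the skip-if-already-present resume check are illustrative only, so treat it as a starting point rather than a finished downloader.

    #!/usr/bin/env python
    # Sketch: download every enwiki-20121201-remote-media tarball listed
    # in the your.org directory index. Assumes a plain HTML index page;
    # adjust the link extraction if the listing format differs.
    import os
    import re
    import requests

    BASE = "http://ftpmirror.your.org/pub/wikimedia/imagedumps/tarballs/fulls/20121201/"
    PREFIX = "enwiki-20121201-remote-media"

    index = requests.get(BASE).text
    # Pull hrefs out of the index page; a sturdier script would use an HTML parser.
    names = sorted(set(re.findall(r'href="([^"]+)"', index)))

    for name in names:
        if not name.startswith(PREFIX):
            continue
        if os.path.exists(name):
            # Crude resume: skip files already on disk (this does not
            # detect partially downloaded files).
            continue
        with requests.get(BASE + name, stream=True) as resp:
            resp.raise_for_status()
            with open(name, "wb") as out:
                for chunk in resp.iter_content(chunk_size=1 << 20):
                    out.write(chunk)
        print("done:", name)

Given the sizes above, you'll want a couple of terabytes free and should be prepared to restart it; a more careful run would also verify file sizes or checksums against the listing.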
Hope this helps.
On Mon, Sep 23, 2013 at 9:58 AM, Ariel T. Glenn ariel@wikimedia.org wrote:
We have a somewhat out-of-date off-site mirror of images (I'm working on the out-of-date part). This includes Commons. It's accessible via rsync, HTTP, and FTP:
http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media
Thanks again to your.org for hosting that.
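For the rsync route, something like the following works from Python; note that the source path below is a placeholder, since the actual rsync modules and paths are documented on the meta page linked above.

    # Sketch: pull part of the media mirror over rsync via subprocess.
    # The source path is a PLACEHOLDER; substitute the real rsync module
    # and path from the Mirroring_Wikimedia_project_XML_dumps page.
    import subprocess

    SRC = "rsync://ftpmirror.your.org/PLACEHOLDER/"  # hypothetical path
    DEST = "/data/wikimedia-media/"

    # -a: archive mode, -v: verbose, --partial: keep partially transferred
    # files so an interrupted run can resume instead of starting over.
    subprocess.check_call(["rsync", "-av", "--partial", SRC, DEST])

rsync is the friendliest option here because it can resume interrupted transfers and only fetches what you don't already have.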
Are these images used on some particular project? If so we might be able to do better.
Ariel
On Mon, 23-09-2013 at 15:22 +0200, Mihai Chintoanu wrote:
Hi everyone,
I have a list of about 1.8 million images which I have to download from commons.wikimedia.org. Is there any simple way to do this which doesn't involve an individual HTTP hit for each image?
Many thanks in advance.
Mihai