On Sep 23, 2013 9:25 AM, "Mihai Chintoanu" <mihai.chintoanu@skobbler.com> wrote:
I have a list of about 1.8 million images which I have to download from commons.wikimedia.org. Is there any simple way to do this which doesn't involve an individual HTTP hit for each image?
You mean full-size originals, not thumbnails scaled to a certain size, right?
You should rsync from a mirror[0] (rsync allows specifying a list of files to copy).
I agree that rsync is probably your best bet.
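For illustration, here's a minimal sketch of driving rsync with a file list from Python. The mirror URL and module path are placeholders, not a real endpoint; substitute one from the Wikimedia mirror list:

    import subprocess

    # Hypothetical mirror address -- replace with a real rsync
    # mirror of Commons originals.
    MIRROR = "rsync://mirror.example.org/wikimedia-commons/"

    def fetch_listed_files(file_list: str, dest: str) -> None:
        """Copy only the paths named in file_list (one path per
        line, relative to the mirror root) into dest."""
        subprocess.run(
            ["rsync", "-av", "--files-from=" + file_list, MIRROR, dest],
            check=True,  # raise if rsync exits non-zero
        )

    if __name__ == "__main__":
        fetch_listed_files("image_paths.txt", "downloads/")

With --files-from, rsync transfers only the named files in a single session, which avoids the per-image HTTP hit you were worried about.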
Another mirror I'm building is on , organised by day of upload. You can also request an individual file directly from the zips, but that's not super-efficient.
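One more note on building the file list for rsync: MediaWiki stores Commons originals under a two-level directory derived from the MD5 of the filename (spaces replaced by underscores), so the relative paths can be generated from the file titles alone. A sketch, assuming that hashed layout also holds on the mirror you pick:

    import hashlib

    def commons_path(filename: str) -> str:
        """Return the hashed path Commons uses for an original,
        e.g. 'a/ab/Foo_bar.jpg' for 'Foo bar.jpg'."""
        name = filename.replace(" ", "_")
        digest = hashlib.md5(name.encode("utf-8")).hexdigest()
        return f"{digest[0]}/{digest[:2]}/{name}"

    # Write one relative path per line, suitable for --files-from.
    with open("image_names.txt", encoding="utf-8") as src, \
         open("image_paths.txt", "w", encoding="utf-8") as out:
        for line in src:
            name = line.strip()
            if name:
                out.write(commons_path(name) + "\n")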