I see you are working on this: https://wikitech.wikimedia.org/view/Dumps/Image_dumps

I don't have an account there (how can I request one?). Why don't you offer incremental image backups in one-day chunks, covering 2004-09-07 through (today - 1 year), to leave enough time to remove copyvios?
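For concreteness, here is a minimal Python sketch of the chunking I have in mind. The date arithmetic and the one-year embargo cutoff are just an illustration of the proposal, not an existing WMF interface:

    from datetime import date, timedelta

    START = date(2004, 9, 7)                      # first day proposed above
    CUTOFF = date.today() - timedelta(days=365)   # one-year copyvio embargo

    def daily_chunks(start=START, cutoff=CUTOFF):
        """Yield (chunk_start, chunk_end) pairs, one per day of uploads."""
        day = start
        while day <= cutoff:
            yield day, day + timedelta(days=1)
            day += timedelta(days=1)

    # e.g. the first chunk would cover uploads from 2004-09-07 00:00
    # up to (but not including) 2004-09-08 00:00.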

2011/12/2 Ariel T. Glenn <ariel@wikimedia.org>
On Fri, 18-11-2011 at 11:49 +0200, Ariel T. Glenn wrote:

>
> There are scripts to download all media used on a project
> (http://meta.wikimedia.org/wiki/Wikix).  As long as the end user runs
> one command, it doesn't matter what's happening on the back end.
>
> > _and_ it needs to be possible for any consumer to perform the task of
> > obtaining the source.  Does the WMF block people who attempt to mirror
> > the project content one item at a time?  IMO blocking them is very
> > sane, but if that is the only way to obtain the source then it would
> > again be breaking the licence.
>
> AFAIK we do not block folks that are making serial requests, even if
> they crawl the entire media space.  Serial requests don't incur a big
> cost on our servers.

I should clarify this.

Crawling the media server and requesting all images one at a time (as
long as a pile of people aren't doing it at once) is fine.  Requesting
all images at one or more specific thumb sizes is not: in the first
case we serve files that already exist, while in the second case the
files may need to be generated and stored somewhere, and we simply
don't have the space to keep generated thumbs of every image on
Commons at arbitrary sizes at the moment.  So if you *do* want to
crawl the media server and request thumbs for everything, please check
in with me so we can figure out how to get you the data you need.
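
To illustrate the difference, here is a rough Python sketch. The
hashed-directory URL layout below matches how upload.wikimedia.org
stores Commons originals and thumbs; the helper names, User-Agent
string, and the one-second delay are only illustrative.  Fetching an
original is a plain static-file hit, while a thumb URL at an arbitrary
width may force us to generate and store a new file:

    import hashlib
    import time
    import urllib.parse
    import urllib.request

    BASE = "https://upload.wikimedia.org/wikipedia/commons"

    def original_url(name):
        """URL of the original file: served straight from disk (cheap)."""
        name = name.replace(" ", "_")
        h = hashlib.md5(name.encode("utf-8")).hexdigest()
        return f"{BASE}/{h[0]}/{h[:2]}/{urllib.parse.quote(name)}"

    def thumb_url(name, width):
        """URL of a scaled thumb: may not exist yet, in which case it
        has to be generated and stored on request (expensive)."""
        name = name.replace(" ", "_")
        h = hashlib.md5(name.encode("utf-8")).hexdigest()
        q = urllib.parse.quote(name)
        return f"{BASE}/thumb/{h[0]}/{h[:2]}/{q}/{width}px-{q}"

    def crawl_originals(names, delay=1.0):
        """Fetch originals one at a time, pausing between requests, in
        the spirit of the 'serial requests are fine' guidance above."""
        for name in names:
            req = urllib.request.Request(
                original_url(name),
                headers={"User-Agent": "media-mirror-sketch/0.1"})
            with urllib.request.urlopen(req) as resp:
                data = resp.read()
            # ... write `data` to local storage here ...
            time.sleep(delay)

Something like that serial loop (one request at a time, with a pause)
is the crawl pattern that keeps the load on our end negligible.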

Ariel



_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l