I don't plan to do dailies any time soon. We don't even have real incrementals for the text revs, which people have been begging for forever; cleaning up the current adds/changes dumps and making them more useful (and making them stable) has to be first. For the images, we need to get the main bulk of the images out of here and into other folks' hands first. Dailies, if they were to happen, would be quite some time down the road; it's why I haven't written them into the plan. Note that we don't have a place to keep a second copy of everything from 2004 til now, which is another reason I can't go that route right now.
To get an account on wikitech, send me the user name you'd like and your preferred email address, and I'll set you up.
Ariel
On 30-01-2012, Mon, at 23:42 +0100, emijrp wrote:
I see you are working on this: https://wikitech.wikimedia.org/view/Dumps/Image_dumps

I don't have an account there (how can I request one?). Why don't you offer incremental image backups in one-day chunks, covering 2004-09-07 through (today - 1 year), to leave enough time to remove copyvios?
2011/12/2 Ariel T. Glenn ariel@wikimedia.org

On 18-11-2011, Fri, at 11:49 +0200, Ariel T. Glenn wrote:
> There are scripts to download all media used on a project
> ( http://meta.wikimedia.org/wiki/Wikix ). As long as the end user runs
> one command, it doesn't matter what's happening on the back end.
>
> > _and_ it needs to be possible for any consumer to perform the task of
> > obtaining the source. Does the WMF block people who attempt to mirror
> > the project content one item at a time? IMO blocking them is very
> > sane, but if that is the only way to obtain the source then it would
> > again be breaking the licence.
>
> AFAIK we do not block folks that are making serial requests, even if
> they crawl the entire media space. Serial requests don't incur a big
> cost on our servers.

I should clarify this. Crawling the media server and requesting all images one at a time (as long as a pile of people aren't doing it at once) is fine. Requesting all images in one or several specific thumb sizes is not: in the first case we serve files that already exist, while in the second case the files may need to be generated and stored somewhere. And we simply don't have the space to keep generated thumbs of every image on Commons in various arbitrary sizes at the moment.

So folks who *do* want to crawl the media server and request thumbs for all of them should check in with me so we can figure out how to get you the data you need.

Ariel

_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
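For anyone wondering what an acceptable "one item at a time" crawl of originals looks like in practice, here is a minimal sketch. Originals on Commons are sharded into directories named after the first one and two hex digits of the MD5 of the (underscored) filename; the helper names and the one-second delay below are illustrative choices, not an official tool.

```python
import hashlib
import time
import urllib.request

# Base URL for original (already-rendered) files on the media server;
# requesting originals never triggers thumbnail generation.
BASE = "https://upload.wikimedia.org/wikipedia/commons"

def commons_url(filename):
    """Return the URL of the original file for a Commons filename.

    Commons shards files into <first hex digit>/<first two hex digits>
    directories derived from the MD5 of the filename, with spaces
    replaced by underscores.
    """
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return "%s/%s/%s/%s" % (BASE, digest[0], digest[:2], name)

def crawl(filenames, delay=1.0):
    """Fetch originals strictly one at a time, pausing between requests."""
    for name in filenames:
        url = commons_url(name)
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        # ... write `data` to disk here ...
        time.sleep(delay)  # throttle: serial requests, never parallel
```

The point is the shape of the loop: serial requests for originals, with a pause, rather than parallel fetches or thumb-size URLs.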