I don't plan to do dailies any time soon. We don't even have real
incrementals for the text revs, which people have been begging for
forever; cleaning up the current adds/changes dumps and making them more
useful (and making them stable) has to be first. For the images, we
need to get the main bulk of the images out of here and into other
folks' hands first. Dailies, if they were to happen, would be quite
some time down the road; it's why I haven't written them into the plan.
Note that we don't have a place to keep a second copy of everything from
2004 til now, which is another reason I can't go that route right now.
To get an account on wikitech, please give me a user name you want and
an email address you prefer and I'll set you up.
Ariel
On Mon, 30-01-2012, at 23:42 +0100, emijrp wrote:
I see you are working on this:
https://wikitech.wikimedia.org/view/Dumps/Image_dumps
I don't have an account there (how can I request one?). Why don't you
offer incremental image backups, in one-day chunks, from 2004-09-07 up
to (today - 1 year), to leave enough time to remove copyvios?
2011/12/2 Ariel T. Glenn <ariel(a)wikimedia.org>
On Fri, 18-11-2011, at 11:49 +0200, Ariel T. Glenn wrote:
There are scripts to download all media used on a project
( http://meta.wikimedia.org/wiki/Wikix ). As long as the end user runs
one command, it doesn't matter what's happening on the back end.
> _and_ it needs to be possible for any consumer to perform the task of
> obtaining the source. Does the WMF block people who attempt to mirror
> the project content one item at a time? IMO blocking them is very
> sane, but if that is the only way to obtain the source then it would
> again be breaking the licence.
AFAIK we do not block folks that are making serial requests, even if
they crawl the entire media space. Serial requests don't incur a big
cost on our servers.
I should clarify this.
Crawling the media server and requesting all images one at a time (as
long as a pile of people aren't doing it at once) is fine. Requesting
all images in one or several specific thumb sizes is not; in the first
case we serve files that already exist, while in the second case the
files may need to be generated and put someplace. And we simply don't
have space at the moment to keep generated thumbs of every image on
Commons in arbitrary sizes. So folks who *do* want to crawl the media
server and request thumbs for all images should check in with me so we
can figure out how to get you the data you need.
Ariel
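[Editor's sketch] The "one item at a time" crawl described above could look
something like the snippet below. This is a minimal illustration, not WMF
policy or an official tool: the helper names, the one-second delay, and the
URL checks are all assumptions; the only facts taken from the thread are
"serial requests for existing originals are fine" and "bulk thumb requests
are not, because thumbs may have to be generated on demand".

```python
import time
import urllib.request

# Assumed Commons URL layout: originals live under .../commons/<hash>/...,
# while generated thumbnails contain a /thumb/ path segment.
ORIGINAL_PREFIX = "https://upload.wikimedia.org/wikipedia/commons/"
THUMB_MARKER = "/thumb/"

def is_original(url):
    """True if the URL points at an already-existing original file,
    not a thumbnail the server might have to generate."""
    return url.startswith(ORIGINAL_PREFIX) and THUMB_MARKER not in url

def fetch_serially(urls, delay_seconds=1.0, fetch=None):
    """Fetch original files one at a time, pausing between requests.

    `fetch` is injectable for testing; by default it does a real HTTP GET.
    Thumbnail URLs are skipped entirely, per the list discussion.
    """
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u).read()
    results = {}
    for url in urls:
        if not is_original(url):
            continue  # skip thumbs: these may need server-side generation
        results[url] = fetch(url)
        time.sleep(delay_seconds)  # serial, politely spaced requests
    return results
```

The delay value is arbitrary; the point is only that requests are serial
rather than parallel, which is the behaviour the thread says is acceptable.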
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l