I don't plan to do dailies any time soon. We don't even have real incrementals for the text revs, which people have been begging for forever; cleaning up the current adds/changes dumps and making them more useful (and making them stable) has to be first. For the images, we need to get the main bulk of the images out of here and into other folks' hands first. Dailies, if they were to happen, would be quite some time down the road; it's why I haven't written them into the plan. Note that we don't have a place to keep a second copy of everything from 2004 til now, which is another reason I can't go that route right now.
To get an account on wikitech, send me the user name you'd like and your preferred email address, and I'll set you up.
Ariel
On 30-01-2012, Mon, at 23:42 +0100, emijrp wrote:
I see you are working on this: https://wikitech.wikimedia.org/view/Dumps/Image_dumps

I don't have an account there (how can I request one?). Why don't you offer incremental image backups in one-day chunks, covering 2004-09-07 through (today - 1 year), to leave enough time to remove copyvios?
2011/12/2 Ariel T. Glenn ariel@wikimedia.org

On 18-11-2011, Fri, at 11:49 +0200, Ariel T. Glenn wrote:
> There are scripts to download all media used on a project
> ( http://meta.wikimedia.org/wiki/Wikix ). As long as the end user runs
> one command, it doesn't matter what's happening on the back end.
>
> > _and_ it needs to be possible for any consumer to perform the task of
> > obtaining the source. Does the WMF block people who attempt to mirror
> > the project content one item at a time? IMO blocking them is very
> > sane, but if that is the only way to obtain the source then it would
> > again be breaking the licence.
>
> AFAIK we do not block folks that are making serial requests, even if
> they crawl the entire media space. Serial requests don't incur a big
> cost on our servers.

I should clarify this. Crawling the media server and requesting all images one at a time (as long as a pile of people aren't doing it at once) is fine. Requesting all images in one or several specific thumb sizes is not: in the first case we serve files that already exist, while in the second case the files may need to be generated and stored somewhere. And we simply don't have the space to keep generated thumbs of every image on Commons in various arbitrary sizes at the moment.

So folks who *do* want to crawl the media server and request thumbs for all of them should check in with me so we can figure out how to get you the data you need.

Ariel

_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
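For anyone wondering what an acceptable "one item at a time" crawl of originals looks like in practice, here is a minimal sketch. Originals on Commons are sharded into directories named after the first one and two hex digits of the MD5 of the (underscored) filename; the helper names and the one-second delay below are illustrative choices, not an official tool.

```python
import hashlib
import time
import urllib.request

# Base URL for original (already-rendered) files on the media server;
# requesting originals never triggers thumbnail generation.
BASE = "https://upload.wikimedia.org/wikipedia/commons"

def commons_url(filename):
    """Return the URL of the original file for a Commons filename.

    Commons shards files into <first hex digit>/<first two hex digits>
    directories derived from the MD5 of the filename, with spaces
    replaced by underscores.
    """
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return "%s/%s/%s/%s" % (BASE, digest[0], digest[:2], name)

def crawl(filenames, delay=1.0):
    """Fetch originals strictly one at a time, pausing between requests."""
    for name in filenames:
        url = commons_url(name)
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        # ... write `data` to disk here ...
        time.sleep(delay)  # throttle: serial requests, never parallel
```

The point is the shape of the loop: serial requests for originals, with a pause, rather than parallel fetches or thumb-size URLs.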