2011/11/18 Ariel T. Glenn <ariel@wikimedia.org>
As I said below, providing multi-terabyte dumps does not seem reasonable
to me.

What is the problem? Bandwidth? Disk space?
 
Monthly incrementals don't provide a workaround, unless you are
suggesting that we put dumps online for every month since the beginning
of the project.  

Yes, indeed.
 
I think that a much more workable way to jump-start a
mirror is to copy the data directly onto disks in our datacenter for an
organization that will provide public access to its copy.  This
requires three things: 1) an organization that wants to host such a
mirror, 2) that organization sending us disks, 3) me clearing it with Rob
and with our datacenter tech, though he has agreed to this in principle
in the past.


Ariel

On Thu, 17-11-2011 at 14:11 +0100, emijrp wrote:
> People can't mirror Commons if there is no public image dump. As there
> is no public image dump, people don't care about mirroring. And so on...
>
> You can offer monthly incremental image dumps.[1] Until mid-2008,
> monthly uploads were under 100 GB; lately they are in the 200-300 GB
> range. People are already mirroring Domas' visit logs at the Internet
> Archive; granted, a month of Commons is about 10x that, but it is not
> impossible. Archive Team has mirrored GeoCities (0.9 TB), Yahoo!
> Videos (20 TB), Jamendo (2.5 TB) and other huge sites. So, if you put
> those image dumps online, they are going to rage-download it all.
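>
> As a rough back-of-the-envelope check (my own sketch, not an existing
> tool): per-month totals like those can be recomputed from the public
> MediaWiki API, using list=allimages with aiprop=size. Walking a busy
> month this way is slow, so it is only an illustration:
>
>     import requests
>
>     API = "https://commons.wikimedia.org/w/api.php"
>
>     def month_upload_bytes(start, end):
>         # Sum the byte sizes of all files uploaded in [start, end);
>         # timestamps are ISO 8601, paging via the API continue token.
>         total = 0
>         params = {
>             "action": "query", "list": "allimages",
>             "aisort": "timestamp", "aistart": start, "aiend": end,
>             "aiprop": "size", "ailimit": "500", "format": "json",
>         }
>         while True:
>             data = requests.get(API, params=params).json()
>             total += sum(f["size"] for f in data["query"]["allimages"])
>             if "continue" not in data:
>                 return total
>             params.update(data["continue"])
>
>     # e.g. May 2008, which per the stats above should be < ~100 GB:
>     print(month_upload_bytes("2008-05-01T00:00:00Z",
>                              "2008-06-01T00:00:00Z"))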
>
> You can start by offering full-resolution monthly dumps up to 2007 or
> so. But, man, we have to restart this sooner or later.
>
> [1]
> http://archiveteam.org/index.php?title=Wikimedia_Commons#Size_stats
>
> 2011/11/17 Ariel T. Glenn <ariel@wikimedia.org>
>         I had a quick look and it turns out that the English language
>         Wikipedia uses over 2.8 million images today.  So, as you
>         point out, an offline reader that just used thumbnails would
>         still have to be selective about its image use.
>
>         In any case, putting together collections of thumbs doesn't
>         resolve the need for a mirror of the originals, which I would
>         really like to see happen.
>
>         Ariel
>
>         On Thu, 17-11-2011 at 01:46 +0100, Erik Zachte wrote:
>
>         > Ariel:
>         > > Providing multiple terabyte sized files for download
>         > > doesn't make any kind of sense to me. However, if we get
>         > > concrete proposals for categories of Commons images people
>         > > really want and would use, we can put those together. I
>         > > think this has been said before on wikitech-l if not here.
>         >
>         > There is another way to cut down on download size, one which
>         > would serve a whole class of content re-users, e.g. offline
>         > readers. For offline readers it is not so important to have
>         > pictures of 20 MB each, but rather to have pictures at all,
>         > preferably tens of KB in size. A download of all images,
>         > scaled down to, say, 600x600 max, would be quite appropriate
>         > for many uses. Maps and diagrams would not survive this
>         > scale-down (the text becomes illegible), but they are very
>         > compact already. In fact, the compression ratio of each
>         > image is a very reliable predictor of the type of content.
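>         >
>         > (A throwaway sketch of that heuristic, assuming PIL/Pillow;
>         > the 0.05 bytes-per-pixel cutoff is a guess, not a measured
>         > value:)
>         >
>         >     import os
>         >     from PIL import Image
>         >
>         >     def shrink(path, out_path, box=(600, 600)):
>         >         img = Image.open(path)
>         >         w, h = img.size
>         >         # Stored bytes per pixel: line art and diagrams
>         >         # compress far better than photos, so a low ratio
>         >         # suggests a map or diagram.
>         >         bpp = os.path.getsize(path) / float(w * h)
>         >         kind = "map/diagram" if bpp < 0.05 else "photo"
>         >         if kind == "photo":
>         >             img.thumbnail(box)  # in place, keeps aspect ratio
>         >         img.save(out_path)      # diagrams stay full size
>         >         return kind, bpp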
>         >
>         > In 2005 I distributed a DVD [1] with all unabridged texts
>         > for the English Wikipedia and all 320,000 images, to be
>         > loaded onto a 4 GB CF card for a handheld. Now we have 10
>         > million images on Commons, so even scaled-down images would
>         > need some filtering, but any collection would still be
>         > 100-1000 times smaller in size.
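>         >
>         > (To put rough numbers on that: at a ballpark 25 KB per
>         > 600x600 thumbnail, 10 million images come to about 250 GB,
>         > against the tens of terabytes of originals, i.e. roughly
>         > the 100x end of the range; stronger filtering or lower JPEG
>         > quality pushes it toward 1000x. The per-file averages are
>         > guesses, not measured values.)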
>         >
>         > Erik Zachte
>         >
>         > [1] http://www.infodisiac.com/Wikipedia/