Re: [Xmldatadumps-l] [Wikitech-l] Fwd: Old English Wikipedia image dump from 2005

17 Nov 2011

People can't mirror Commons if there is no public image dump. And as there
is no public image dump, people don't care about mirroring. And so on...

You can offer monthly incremental image dumps.[1] Until mid-2008, monthly
uploads were below 100 GB. Recently, they have been in the 200-300 GB range.
People are mirroring Domas' visit logs at the Internet Archive. OK, the
Commons monthly size is about 10x that, but it is not impossible. Archive
Team has mirrored GeoCities (0.9TB), Yahoo! Videos (20TB), Jamendo (2.5TB)
and other huge sites. So, if you put those image dumps online, they are
going to rage-download them all.

You can start by offering full-resolution monthly dumps up to 2007 or so.
But, man, we have to restart this sooner or later.

[1] http://archiveteam.org/index.php?title=Wikimedia_Commons#Size_stats

2011/11/17 Ariel T. Glenn <ariel(a)wikimedia.org>

...
 I had a quick look and it turns out that the English language Wikipedia
 uses over 2.8 million images today.  So, as you point out, an offline
 reader that just used thumbnails would still have to be selective about
 its image use.

 In any case, putting together collections of thumbs doesn't resolve the
 need for a mirror of the originals, which I would really like to see
 happen.

 Ariel

 On Thu, 17-11-2011, at 01:46 +0100, Erik Zachte wrote:
  Ariel:
 > Providing multiple terabyte sized files for download doesn't make any
 > kind of sense to me. However, if we get concrete proposals for categories
 > of Commons images people really want and would use, we can put those
 > together. I think this has been said before on wikitech-l if not here.

 There is another way to cut down on download size, which would serve a
 whole class of content re-users, e.g. offline readers.
 For offline readers it is not so important to have pictures of 20 MB
 each, rather to have pictures at all, preferably tens of KB in size.
 A download of all images, scaled down to say 600x600 max, would be quite
 appropriate for many uses.
 Maps and diagrams would not survive this scale-down (illegible text), but
 are very compact already.
 In fact the compression ratio of each image is a very reliable predictor
 of the type of content.
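 The scale-down idea above can be sketched in a few lines. This is only an
 illustration, not anything from an actual dump script: the function names,
 the 24-bit RGB assumption, and especially the ratio_threshold cut-off are
 hypothetical. It fits an image into a 600x600 box without upscaling, and
 leaves highly compressed images (likely maps and diagrams) at their original
 size.

```python
def fit_within(width, height, max_side=600):
    """Return (w, h) scaled to fit a max_side x max_side box, never upscaling."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def compression_ratio(width, height, file_bytes, bytes_per_pixel=3):
    """Raw pixel bytes divided by stored file size (assumes 24-bit RGB)."""
    if file_bytes <= 0:
        raise ValueError("file size must be positive")
    return (width * height * bytes_per_pixel) / file_bytes


def plan(width, height, file_bytes, ratio_threshold=50.0):
    """Decide whether to keep an image as-is or scale it down.

    ratio_threshold is a made-up cut-off: a high ratio suggests a compact
    map or diagram whose text would become illegible if scaled down.
    """
    if compression_ratio(width, height, file_bytes) >= ratio_threshold:
        return ("keep", width, height)
    return ("scale",) + fit_within(width, height)
```

 For example, plan(3000, 2000, 20_000_000) would scale a large photo down to
 600x400, while plan(1000, 1000, 30_000) would keep a compact diagram at its
 original size. A real cut-off would have to be tuned against actual Commons
 files.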

 In 2005 I distributed a DVD [1] with all unabridged texts for English
 Wikipedia and all 320,000 images on one DVD, to be loaded on a 4 GB CF
 card for handhelds.
 Now we have 10 million images on Commons, so even scaled-down images
 would need some filtering, but any collection would still be 100-1000
 times smaller in size.

 Erik Zachte

 [1] http://www.infodisiac.com/Wikipedia/

 _______________________________________________
 Xmldatadumps-l mailing list
 Xmldatadumps-l(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l 

