I had a quick look, and it turns out that the English-language Wikipedia uses over 2.8 million images today. So, as you point out, an offline reader that used only thumbnails would still have to be selective about its image use.
In any case, putting together collections of thumbnails doesn't remove the need for a mirror of the originals, which I would really like to see happen.
Ariel
On Thu, 17-11-2011, at 01:46 +0100, Erik Zachte wrote:
Ariel:
Providing multi-terabyte files for download doesn't make any kind of sense to me. However, if we get concrete proposals for categories of Commons images that people really want and would use, we can put those together. I think this has been said before on wikitech-l, if not here.
There is another way to cut down on download size, which would serve a whole class of content re-users, e.g. offline readers. For offline readers it is not so important to have pictures of 20 MB each; what matters is to have pictures at all, preferably tens of KB in size. A download of all images, scaled down to say 600x600 max, would be quite appropriate for many uses. Maps and diagrams would not survive this scale-down (the text becomes illegible), but they are very compact already. In fact, the compression ratio of each image is a very reliable predictor of the type of content.
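A minimal sketch of that scale-down might look like the following (Python, assuming the Pillow imaging library; the directory arguments and the 600-pixel bound are illustrative assumptions, not part of any existing dump tool). It also records bytes per pixel of the original file, which is the compression-ratio signal mentioned above:

#!/usr/bin/env python
# Sketch: bound each image to 600x600 (keeping aspect ratio) and record its
# compression ratio, i.e. bytes per pixel of the original file, which tends
# to separate photographs from flat maps and diagrams.
# Directory arguments and the 600-pixel limit are illustrative only.
import os
import sys

from PIL import Image  # Pillow

MAX_DIM = 600

def shrink(src_path, dst_path):
    img = Image.open(src_path)
    orig_w, orig_h = img.size
    ratio = os.path.getsize(src_path) / float(orig_w * orig_h)  # bytes/pixel
    img.thumbnail((MAX_DIM, MAX_DIM))  # shrinks in place, never enlarges
    img.save(dst_path)
    return ratio

if __name__ == "__main__":
    src_dir, dst_dir = sys.argv[1], sys.argv[2]
    for name in os.listdir(src_dir):
        try:
            ratio = shrink(os.path.join(src_dir, name),
                           os.path.join(dst_dir, name))
            # Low bytes/pixel suggests a map or diagram, high suggests a photo.
            print("%s\t%.4f" % (name, ratio))
        except IOError:
            pass  # not an image (or unreadable); skip it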
In 2005 I distributed a DVD [1] with all unabridged texts for the English Wikipedia and all 320,000 images, to be loaded onto a 4 GB CF card for a handheld. Now we have 10 million images on Commons, so even scaled-down images would need some filtering, but any such collection would still be 100-1000 times smaller in size.
Erik Zachte
[1] http://www.infodisiac.com/Wikipedia/