Hello,
Is the bandwidth used really a big problem? Bandwidth is pretty cheap these days, and given Wikipedia's total draw, I suspect the occasional dump download isn't much of a problem.
I am not sure about the cost of the bandwidth, but the Wikipedia image dumps are no longer available from the dump site anyway. I am guessing they were removed partly because of the bandwidth cost, or perhaps because of image licensing issues.
from: http://en.wikipedia.org/wiki/Wikipedia_database#Images_and_uploaded_files
"Currently Wikipedia does not allow or provide facilities to download all images. As of 17 May 2007, Wikipedia disabled or neglected all viable bulk downloads of images including torrent trackers. Therefore, there is no way to download image dumps other than scraping Wikipedia pages or using Wikix, which converts a database dump into a series of scripts to fetch the images.
Unlike most article text, images are not necessarily licensed under the GFDL & CC-BY-SA-3.0. They may be under one of many free licenses, in the public domain, believed to be fair use, or even copyright infringements (which should be deleted). In particular, use of fair use images outside the context of Wikipedia or similar works may be illegal. Images under most licenses require a credit, and possibly other attached copyright information. This information is included in image description pages, which are part of the text dumps available from download.wikimedia.org. In conclusion, download these images at your own risk."
Bittorrent's real strength is when a lot of people want to download the same thing at once. E.g., when a new Ubuntu release comes out. Since Bittorrent requires all downloaders to be uploaders, it turns the flood of users into a benefit. But unless somebody has stats otherwise, I'd guess that isn't the problem here.
Bittorrent is simply a more efficient method of distributing files, especially if the much larger Wikipedia image files were made available again. The last dump of English Wikipedia including images is over 200GB, but is understandably not available for download. Even if only 10 people per month download these large files, bittorrent should be able to reduce the bandwidth cost to Wikipedia significantly. Setting up bittorrent for this would also cost Wikipedia very little, could save money in the long run, and would encourage people to experiment with offline encyclopedia usage etc. Making people crawl Wikipedia with Wikix if they want the images is a bad solution, as it means the images are downloaded inefficiently. Also, one Wikix user reported that his connection was cut off by a Wikipedia admin for "remote downloading".
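As a rough back-of-the-envelope for the numbers above: the dump size and downloads-per-month figures come from this thread, while the fraction of traffic offloaded to peers is purely a hypothetical assumption for illustration, not a measured value.

```python
# Rough estimate of monthly origin bandwidth for the image dump,
# with and without BitTorrent. PEER_SHARE is a hypothetical
# assumption; real savings depend on how many downloaders seed.

DUMP_SIZE_GB = 200          # approximate size of the English Wikipedia image dump
DOWNLOADS_PER_MONTH = 10    # illustrative figure from the discussion
PEER_SHARE = 0.7            # assumed fraction of traffic served by other peers

direct_gb = DUMP_SIZE_GB * DOWNLOADS_PER_MONTH       # plain HTTP: origin serves everything
torrent_gb = direct_gb * (1 - PEER_SHARE)            # BitTorrent: origin seeds the remainder

print(f"direct HTTP:    {direct_gb} GB/month from Wikimedia servers")
print(f"via BitTorrent: {torrent_gb:.0f} GB/month from Wikimedia servers")
```

Even at these small download counts the origin load drops from terabyte scale to a few hundred gigabytes a month under that assumption, and the savings grow with every extra downloader who seeds.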
Unless there are legal reasons for not allowing the images to be downloaded, I think the Wikipedia image files should be made available for efficient download again. Since Wikix can already be used to fetch the images, presumably it would be equally legal to offer the image dump directly. Thoughts?
cheers, Jamie
William
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l