Hi All,
Commons.wikimedia.org is growing and provides a fairly comprehensive set of media files, including many interesting historical documents. Contributors rely on the availability and persistence of commons.wikimedia.org, but currently the full export is only available on download.wikimedia.org (ok, not today ;-).
I was wondering if it would be possible to allow web robots to access http://upload.wikimedia.org/wikipedia/commons/ to gather and mirror the media files. As this is pure HTTP, mirroring could benefit from the standard HTTP caching mechanisms (conditional requests using the Last-Modified/ETag validators), instead of relying on a large dump containing all the media files, which is more difficult to cache and update incrementally.
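Just to illustrate what I mean by benefiting from HTTP caching, here is a rough Python sketch (the example file URL and the state file name are made up, and a real mirror would of course walk a full file list); it only re-downloads a file when the server-side copy has changed, and the same validators would let intermediary HTTP caches help too:

import json
import os
import urllib.error
import urllib.request

STATE_FILE = "mirror_state.json"  # remembers ETag/Last-Modified per URL

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def mirror(url, dest, state):
    req = urllib.request.Request(url)
    meta = state.get(url, {})
    # send the validators we saved last time, so the server can answer 304
    if meta.get("etag"):
        req.add_header("If-None-Match", meta["etag"])
    if meta.get("last_modified"):
        req.add_header("If-Modified-Since", meta["last_modified"])
    try:
        with urllib.request.urlopen(req) as resp:
            with open(dest, "wb") as out:
                out.write(resp.read())
            state[url] = {
                "etag": resp.headers.get("ETag", ""),
                "last_modified": resp.headers.get("Last-Modified", ""),
            }
            print("updated", dest)
    except urllib.error.HTTPError as e:
        if e.code == 304:  # not modified: our local copy is still current
            print("unchanged", dest)
        else:
            raise

if __name__ == "__main__":
    state = load_state()
    # hypothetical example file under the commons upload tree
    mirror("http://upload.wikimedia.org/wikipedia/commons/a/ab/Example.jpg",
           "Example.jpg", state)
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)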
Maybe this could enable a more distributed backup approach and improve the resilience of commons.wikimedia.org?
Thanks a lot for your work,
adulau