Hi All,
Commons.wikimedia.org is growing and provides a fairly comprehensive set of media files, including many interesting historical documents. Contributors rely on the availability and persistence of commons.wikimedia.org, but currently the full export is only available on download.wikimedia.org (ok, not today ;-).
I was wondering if it would be possible to allow web robots to access http://upload.wikimedia.org/wikipedia/commons/ to gather and mirror the media files. As this is pure HTTP, mirroring could benefit from the standard HTTP caching mechanisms (conditional requests using the Last-Modified/ETag validators), instead of relying on a large dump containing all the media files, which is more difficult to cache and update incrementally.
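Just to illustrate what I mean by benefiting from HTTP caching, here is a rough Python sketch (the example file URL and the state file name are made up, and a real mirror would of course walk a full file list); it only re-downloads a file when the server-side copy has changed, and the same validators would let intermediary HTTP caches help too:

import json
import os
import urllib.error
import urllib.request

STATE_FILE = "mirror_state.json"  # remembers ETag/Last-Modified per URL

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def mirror(url, dest, state):
    req = urllib.request.Request(url)
    meta = state.get(url, {})
    # send the validators we saved last time, so the server can answer 304
    if meta.get("etag"):
        req.add_header("If-None-Match", meta["etag"])
    if meta.get("last_modified"):
        req.add_header("If-Modified-Since", meta["last_modified"])
    try:
        with urllib.request.urlopen(req) as resp:
            with open(dest, "wb") as out:
                out.write(resp.read())
            state[url] = {
                "etag": resp.headers.get("ETag", ""),
                "last_modified": resp.headers.get("Last-Modified", ""),
            }
            print("updated", dest)
    except urllib.error.HTTPError as e:
        if e.code == 304:  # not modified: our local copy is still current
            print("unchanged", dest)
        else:
            raise

if __name__ == "__main__":
    state = load_state()
    # hypothetical example file under the commons upload tree
    mirror("http://upload.wikimedia.org/wikipedia/commons/a/ab/Example.jpg",
           "Example.jpg", state)
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)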
Maybe this could enable a more distributed backup approach and improve the resilience of commons.wikimedia.org?
Thanks a lot for your work,
adulau