On Sat, Jan 9, 2010 at 7:44 AM, Platonides &lt;Platonides@gmail.com&gt; wrote:
Robert Rohde wrote:
Of course, strictly speaking we already provide HTTP access to everything. So the real question is how we can make access easier, more reliable, and less burdensome. You or someone else suggested an API for grabbing files, and that seems like a good idea. Ultimately the best answer may well be to take multiple approaches, to accommodate both people like you who want everything and people who want only more modest collections.
-Robert Rohde
Anthony wrote:
The bandwidth-saving way to do things would be to just allow mirrors to use hotlinking. Requiring a middle man to temporarily store images (many, and possibly even most of which will never even be downloaded by end users) just wastes bandwidth.
There is already a way to instruct a wiki to use images from a foreign wiki as they are needed, with proper caching.
In 1.16 it will be even easier, as you will only need to set $wgUseInstantCommons = true; to use Wikimedia Commons images. http://www.mediawiki.org/wiki/Manual:$wgUseInstantCommons
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
I'd really like to underline this last piece, as it's something I feel we're not promoting as heavily as we should be--with 1.16 making it a 1-line switch to turn it on, perhaps we should publicize this. Thanks to work Brion did in 1.13, which I picked up later on, we have the ability to use files from Wikimedia Commons (or potentially any MediaWiki installation). As pointed out above, this has configurable caching that can be set as aggressively as you'd like.
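For anyone who hasn't seen it, the 1-line switch mentioned above is literally this in LocalSettings.php (MediaWiki 1.16 or later), per the manual page linked in the quote:

```php
# LocalSettings.php -- enable InstantCommons (MediaWiki 1.16+).
# With this set, any [[File:...]] reference not found locally is
# fetched from Wikimedia Commons via its API and cached locally.
$wgUseInstantCommons = true;
```

Under the hood it just adds a preconfigured ForeignAPIRepo pointing at Commons, so there's nothing else to install.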
To mirror Wikipedia these days, all you'd need to do is load the article and template dumps, point ForeignAPIRepos at Commons and enwiki, and you've got yourself a working mirror. No need to dump the images and reimport them somewhere. Cache the thumbnails aggressively enough and you'll be hosting the images locally, in effect.
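As a rough sketch of what that mirror setup could look like in LocalSettings.php -- parameter names here follow Manual:$wgForeignFileRepos, and the exact keys and cache-expiry values are illustrative, not prescriptive:

```php
# Sketch: point a mirror's file repos at Commons and enwiki via the API.
# Parameter names per Manual:$wgForeignFileRepos; adjust expiries to taste.
$wgForeignFileRepos[] = array(
    'class'                  => 'ForeignAPIRepo',
    'name'                   => 'commons',
    'apibase'                => 'http://commons.wikimedia.org/w/api.php',
    'fetchDescription'       => true,   // also fetch file description pages
    'descriptionCacheExpiry' => 43200,  // cache descriptions for 12 hours
    'apiThumbCacheExpiry'    => 86400,  // keep fetched thumbnails locally for 24 hours
);
$wgForeignFileRepos[] = array(
    'class'               => 'ForeignAPIRepo',
    'name'                => 'enwiki',
    'apibase'             => 'http://en.wikipedia.org/w/api.php',
    'fetchDescription'    => true,
    'apiThumbCacheExpiry' => 86400,
);
```

Setting apiThumbCacheExpiry high is what gives you the "hosting the images locally, in effect" behavior: thumbnails are fetched once over the API and then served from your own cache.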
-Chad