[Foundation-l] Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Anthony wikimail at inbox.org
Tue Jun 23 12:59:36 UTC 2009


On Mon, Jun 22, 2009 at 9:15 PM, Platonides <Platonides at gmail.com> wrote:

> Anthony wrote:
> > (although I still haven't seen the WMF step up
> > to the plate and make it easy for people to make a full history fork, or
> > even to download all the images)
>
> You'll find full history dumps of almost all wikis at
> http://download.wikimedia.org/


Key word being "almost".

> Although not trivial, downloading all images is in fact quite easy.


Yep.  All I need is permission.


> But do you have enough space to dedicate?


Not at the moment.  No sense in buying the drives when I don't have
permission to fill them up.


> How many wikis do you want to mirror? Just commons is more than 3 TB...


Commons and En.wikipedia would probably be good for starters.

The main thing I want is permission to scrape en.wikipedia, though.  (Not
really scraping, as I'd probably use the API and Special:Export.  Basically
I just would like someone official to tell me how *fast* I'm allowed to use
the API and Special:Export.  Special:Export especially, because I could
easily overwhelm the servers using that, due to a bug in the script.)
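
To illustrate the kind of rate-limited use of Special:Export described above, here is a minimal sketch of a serial, throttled fetcher. It is only an illustration under stated assumptions: the names `export_url`, `Throttle`, and `fetch_pages` are hypothetical, the User-Agent string is a placeholder, and the minimum interval is an arbitrary value rather than any officially sanctioned rate, since no such rate is given in this thread.

```python
import time
import urllib.parse
import urllib.request


def export_url(title, base="https://en.wikipedia.org/wiki/Special:Export/"):
    """Build a Special:Export URL for a single page title."""
    return base + urllib.parse.quote(title.replace(" ", "_"))


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; placeholder, not an official limit
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_pages(titles, throttle, opener=urllib.request.urlopen):
    """Fetch export XML for each title serially, one request at a time."""
    for title in titles:
        throttle.wait()
        req = urllib.request.Request(
            export_url(title),
            # Placeholder identification; a real client should name itself
            # and give a working contact address.
            headers={"User-Agent": "example-mirror/0.1 (contact: you@example.org)"},
        )
        with opener(req) as resp:
            yield title, resp.read()
```

Usage would look like `for title, xml in fetch_pages(["Main Page"], Throttle(5.0)): ...`, where the 5-second interval is again just an assumed value to be replaced by whatever rate is actually permitted.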

> That's the reason so few people were interested in the images when the
> image dump was available.


I downloaded it.  It was well under 1 TB at the time.

