Jimmy Wales wrote:
> From these pages, it should be possible to get a list of all their article titles.
> These could be matched up against Wikipedia article titles.
> Then we could ask the hypothetical: suppose Wikipedia just snagged the same 55,000 topics as Columbia? How big would the resulting text be?
I'm taking it!
Just today I downloaded the en.wikipedia.org database dump. I don't have a very fast machine, so decompressing it took a while, and the import into the DB is still running. Does anyone know roughly how long that takes? (It doesn't show a progress meter or any other indication of how far along it is.)
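In the meantime, a crude way to watch the progress is to poll the row count directly. Something like this should work, assuming the dump fills the old `cur` table; the database name and credentials below are placeholders for whatever your setup uses:

#!/usr/bin/perl
# Crude progress check while the import runs: count the rows that
# have arrived in the `cur` table so far.
# Assumes the pre-1.5 schema (article text in `cur`); DB name and
# credentials are placeholders.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=wikidb;host=localhost',
                       'wikiuser', 'secret', { RaiseError => 1 });
my ($rows) = $dbh->selectrow_array('SELECT COUNT(*) FROM cur');
print "Rows imported into cur so far: $rows\n";
$dbh->disconnect;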
But once that is done, the Perl script will be easy.
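Something along these lines is what I have in mind. It's only a sketch for now: the title-list file name, the DB credentials, and the assumption that the article text sits in the old `cur` table are all placeholders until the import finishes and I can check the schema.

#!/usr/bin/perl
# Match a list of Columbia Encyclopedia titles against Wikipedia
# articles and add up the size of the matching wikitext.
# File name, credentials and table layout are assumptions.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=wikidb;host=localhost',
                       'wikiuser', 'secret', { RaiseError => 1 });
my $sth = $dbh->prepare(
    'SELECT LENGTH(cur_text) FROM cur
      WHERE cur_namespace = 0 AND cur_title = ?');

my ($matched, $total_bytes) = (0, 0);
open my $titles, '<', 'columbia_titles.txt'
    or die "Cannot open title list: $!";
while (my $title = <$titles>) {
    chomp $title;
    $title =~ s/ /_/g;          # titles are stored with underscores
    $title = ucfirst $title;    # and with the first letter capitalised
    my ($len) = $dbh->selectrow_array($sth, undef, $title);
    next unless defined $len;   # no article with that exact title
    $matched++;
    $total_bytes += $len;
}
close $titles;
$dbh->disconnect;

print "Matched $matched Columbia titles; ",
      "their combined wikitext is $total_bytes bytes.\n";

Matching on exact titles only will of course miss redirects and spelling variants, so whatever number comes out should be read as a lower bound.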
Timwi