Hi,
I want to download the English Wikipedia database and process the data further. The SQL dump is approximately 30 GB in compressed form. So far I have set up MySQL to import small database dumps in other languages. What is the best and quickest approach to importing the English database into MySQL? My understanding is that I will have to download all the compressed files from http://download.wikimedia.org/wikipedia/en/ and cat them together, so that I can gunzip the result and import it into MySQL. Can I import subsets of the database without downloading all the dumps? If so, where can I find these files?
Is there any other utility to convert the SQL dumps to HTML apart from the wiki2static.pl script?
I am a new user and any help or advice will be very useful.
Thanks, -Hemali
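For what it's worth, the "cat the pieces, then gunzip" step described above can be sketched in Python as below. This is only a rough sketch under stated assumptions: it assumes the downloaded pieces are consecutive slices of gzip data that sort into the right order by file name, and the names iter_decompressed, write_decompressed, and the example paths are made up for illustration, not taken from the download site. The pieces are streamed in small chunks, so nothing close to 30 GB has to sit in memory.

```python
import zlib
from pathlib import Path
from typing import Iterable, Iterator

GZIP_WBITS = zlib.MAX_WBITS | 16  # the extra 16 tells zlib to expect a gzip header


def iter_decompressed(parts: Iterable[Path], chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield decompressed bytes from gzip data split across `parts`, read in order.

    Equivalent in spirit to `cat part1 part2 ... | gunzip`, but streamed.
    """
    decomp = zlib.decompressobj(wbits=GZIP_WBITS)
    for part in parts:
        with open(part, "rb") as fh:
            while chunk := fh.read(chunk_size):
                while chunk:
                    out = decomp.decompress(chunk)
                    if out:
                        yield out
                    if decomp.eof:
                        # One gzip member ended; any leftover bytes start a new member.
                        chunk = decomp.unused_data
                        decomp = zlib.decompressobj(wbits=GZIP_WBITS)
                    else:
                        chunk = b""


def write_decompressed(parts: Iterable[Path], dest: Path) -> int:
    """Write the decompressed stream to `dest` and return the number of bytes written."""
    total = 0
    with open(dest, "wb") as out:
        for block in iter_decompressed(parts):
            out.write(block)
            total += len(block)
    return total


# Hypothetical usage (directory and file pattern are assumptions):
#   parts = sorted(Path("dumps").glob("*.sql.gz.*"))
#   write_decompressed(parts, Path("enwiki.sql"))
```

The resulting .sql file can then be fed to the mysql command-line client on standard input, the same way as the smaller dumps. Whether a subset of the pieces is usable on its own depends on how the dump was split, which this sketch does not address.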
To answer one question out of several:
On 4/19/05, Hemali Majithia majithia@streamsage.com wrote:
Is there any other utility to convert the SQL dumps to HTML apart from the wiki2static.pl script?
There are a couple of other tools like this listed at http://meta.wikimedia.org/wiki/Alternate_parsers though I have no idea which are good, which are "actively supported", or even which work.
On 4/19/05, Rowan Collins rowan.collins@gmail.com wrote:
The correct URL is http://meta.wikimedia.org/wiki/Alternative_parsers
Alternate != Alternative ;-)
-- David Iberri
On 4/19/05, David Iberri diberri@gmail.com wrote:
On 4/19/05, Rowan Collins rowan.collins@gmail.com wrote:
The correct URL is http://meta.wikimedia.org/wiki/Alternative_parsers
Alternate != Alternative ;-)
D'oh! I take it I got it right when I created the page in the first place though? ;)
On Tuesday, April 19, 2005 11:04 PM, Rowan Collins rowan.collins@gmail.com wrote:
On 4/19/05, David Iberri diberri@gmail.com wrote:
On 4/19/05, Rowan Collins rowan.collins@gmail.com wrote:
The correct URL is http://meta.wikimedia.org/wiki/Alternative_parsers
Alternate != Alternative ;-)
D'oh! I take it I got it right when I created the page in the first place though? ;)
Actually, I quite fancy seeing an "alternate parser", which would presumably be something that parses every other piece of syntax. :-)
Yours,
wikitech-l@lists.wikimedia.org