On 12/21/2007 05:02 PM, Lars Aronsson wrote:
Hi Johann,
When I just want current pages with no discussion and
no history but with all the templates and category lists
working correctly, which of all the files in
http://download.wikimedia.org/enwiki/latest/
That should be pages-articles.xml,
"the one most people will want".
What puzzles me about that is that there are so many
other files in the download dir ... don't I need any
of these?
When I last tried that (I downloaded pages-articles.xml,
converted with Xml2sql and imported with mysqlimport),
templates did not show properly
and no category lists were available at all.
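
To narrow this down, one thing I could check is whether the template and category pages are in the dump at all. Below is a minimal sketch, assuming the dump is the usual MediaWiki export XML with <page> and <title> elements and that the English namespace prefixes "Template:" and "Category:" apply. The function name and the sample data are made up for illustration. It only tells whether those pages are present in the file; it says nothing about whether the link tables behind the category listings got populated by the import.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def count_pages_by_prefix(stream, prefixes=("Template:", "Category:")):
    """Stream through an export XML file and count pages per title prefix."""
    counts = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = ""
        for child in elem:
            if local_name(child.tag) == "title":
                title = child.text or ""
                break
        # Pages matching none of the prefixes are counted as "other".
        label = next((p for p in prefixes if title.startswith(p)), "other")
        counts[label] += 1
        elem.clear()  # drop the page's children to limit memory use
    return counts


# Tiny made-up sample standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page><title>Faroe Islands</title></page>
  <page><title>Template:Infobox</title></page>
  <page><title>Category:Islands</title></page>
  <page><title>Template:Stub</title></page>
</mediawiki>"""

if __name__ == "__main__":
    print(count_pages_by_prefix(io.BytesIO(SAMPLE)))
```

For a real dump, the in-memory sample would be replaced by an open file handle (for example from bz2.open on the compressed file).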
do I have to download and
how do I have to import the data so that the import
is complete and fast?
Fast? Haha, no, it won't be.
You should try it out on a smaller language than English.
Try Faroese (fo) with 2,700 articles or Anglo-Saxon (ang) with 900
articles.
Well, I meant *relatively fast* :)
It seems there are different ways to do the import which
might be differently fast, given the same dump.
I also wonder if there are any settings that would
speed up the import into the local DB ... e.g., I
turned off MySQL's binary log, which seemed to speed
things up. I only want to read the local copy, not
update anything.
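
To illustrate the kind of settings I mean: here is a rough sketch, assuming the import is a stream of SQL statements piped into the mysql client (not tab files loaded with mysqlimport) and that the tables are InnoDB. The session variables autocommit, unique_checks and foreign_key_checks are standard MySQL ones. The helper name wrap_sql is hypothetical, and I have not measured how much this helps for a Wikipedia dump.

```python
# Session settings commonly suggested for bulk loads into InnoDB tables.
FAST_IMPORT_PREAMBLE = [
    "SET autocommit = 0;",
    "SET unique_checks = 0;",
    "SET foreign_key_checks = 0;",
]

# Commit once at the end and restore the checks afterwards.
FAST_IMPORT_POSTAMBLE = [
    "COMMIT;",
    "SET unique_checks = 1;",
    "SET foreign_key_checks = 1;",
]


def wrap_sql(lines):
    """Yield the SQL lines surrounded by the bulk-load settings above."""
    yield from FAST_IMPORT_PREAMBLE
    for line in lines:
        yield line.rstrip("\n")
    yield from FAST_IMPORT_POSTAMBLE


if __name__ == "__main__":
    # Made-up statement standing in for the converted dump.
    for statement in wrap_sql(["INSERT INTO page VALUES (1);\n"]):
        print(statement)
```

The output would be piped into the mysql client in place of the raw SQL file.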
Cheers,
Johann