On 12/21/2007 05:02 PM, Lars Aronsson wrote:
Hi Johann,
When I just want current pages with no discussion and no history but with all the templates and category lists working correctly, which of all the files in http://download.wikimedia.org/enwiki/latest/
That should be pages-articles.xml, "the one most people will want".
What puzzles me about that is that there are so many other files in the download dir ... don't I need any of these?
When I last tried that (I downloaded pages-articles.xml, converted it with Xml2sql and imported it with mysqlimport), templates did not show properly and no category lists were available at all.
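One way to narrow this down is to check whether the Template: and Category: pages are in the dump at all. Below is a minimal sketch that counts page titles by prefix while streaming the XML. The function name count_by_prefix, the sample document and its xmlns value are only illustrative assumptions, not taken from a real dump. If the pages are there, the gap is more likely on the import side (as far as I know, category listings are read from the categorylinks table, not from the page text).

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_by_prefix(stream, prefixes=("Template:", "Category:")):
    """Count <title> elements by title prefix while streaming the XML."""
    counts = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Drop the XML namespace part, e.g. "{http://...}title" -> "title".
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "title":
            title = elem.text or ""
            key = next((p for p in prefixes if title.startswith(p)), "other")
            counts[key] += 1
        elif tag == "page":
            # Free the finished page so memory use stays small on big dumps.
            elem.clear()
    return counts


# Tiny made-up sample; the xmlns value here is an assumption for illustration.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
<page><title>Template:Infobox</title></page>
<page><title>Category:Islands</title></page>
<page><title>Faroe Islands</title></page>
</mediawiki>"""

print(count_by_prefix(io.BytesIO(SAMPLE)))
```

For a real compressed dump you could pass bz2.open(path, "rb") from the standard library instead of the in-memory sample.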
do I have to download and how do I have to import the data so that the import is complete and fast?
Fast? Haha, no, it won't be.
You should try it out on a smaller language than English. Try Faroese (fo) with 2,700 articles or Anglo-Saxon (ang) with 900 articles.
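For trying the small wikis, here is a small sketch that builds the dump URLs from the language codes. It assumes that the other wikis follow the same directory layout as the enwiki URL above and that the file is named <lang>wiki-latest-pages-articles.xml.bz2. The file name pattern and the helper name dump_url are assumptions, so check the directory listing before relying on them.

```python
# Base URL as given for enwiki earlier in this thread.
BASE_URL = "http://download.wikimedia.org"


def dump_url(lang, base=BASE_URL):
    """Return the assumed pages-articles dump URL for one language code."""
    wiki = f"{lang}wiki"
    # Assumed naming pattern; verify it against the actual directory listing.
    return f"{base}/{wiki}/latest/{wiki}-latest-pages-articles.xml.bz2"


for lang in ("fo", "ang"):
    print(dump_url(lang))
```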
Well, I meant *relatively fast* :) It seems there are different ways to do the import, which might differ in speed given the same dump.
I also wonder if there are any settings that would speed up the import into the local DB ... e.g., I turned off the binary log of MySQL, which seemed to speed things up. I only want to read the local copy, not update anything.
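To make the settings question concrete, here is a rough sketch of the kind of session settings I mean. It assumes the import is fed to the mysql client as one stream of SQL statements (so it would not apply as-is to mysqlimport) and that the tables are InnoDB. The variables sql_log_bin, unique_checks, foreign_key_checks and autocommit are MySQL session variables; the helper name wrap_import is made up for illustration, and whether each setting helps for a given dump is something to measure, not a given.

```python
# Session settings switched off before the bulk load.
PREAMBLE = [
    "SET sql_log_bin = 0;",         # skip the binary log (needs sufficient privileges)
    "SET unique_checks = 0;",       # defer uniqueness checks on secondary indexes
    "SET foreign_key_checks = 0;",  # skip foreign key validation during the load
    "SET autocommit = 0;",          # commit once at the end instead of per statement
]

# Commit the load and restore the checks afterwards.
POSTAMBLE = [
    "COMMIT;",
    "SET unique_checks = 1;",
    "SET foreign_key_checks = 1;",
]


def wrap_import(sql_lines):
    """Yield the SQL statements surrounded by the speed-up settings."""
    yield from PREAMBLE
    yield from sql_lines
    yield from POSTAMBLE


# Hypothetical one-statement "dump", just to show the resulting stream.
for line in wrap_import(["INSERT INTO page_demo VALUES (1, 'Example');"]):
    print(line)
```

The output could be piped into the mysql client. Since the local copy is read-only afterwards, skipping the binary log loses nothing here.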
Cheers, Johann
wikitech-l@lists.wikimedia.org