On 23/10/11 23:19, Fred Zimmerman wrote:
Good points! As the post indicates, I'm not very experienced with any of these tools and made a lot of dummy mistakes. Part of the point of the post is that even if you are pretty dumb, you can get this done if you persevere!
The SQL import of the links would have been helpful; most of the instructions I found recommended using pages-articles.xml, which is a good place to start. There's no need for two copies of the big wiki.xml file; I would suggest just renaming the original download to enwiki.xml right at the outset.
The file I mentioned to skip was enwiki.sql, which is not XML but a copy of the XML expressed as SQL queries.
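
For what it's worth, the rename step could be sketched roughly as below. This is only an illustration in Python, not what the post actually did: the function name rename_dump is made up, and the name of the downloaded file is whatever you pass in. The point is that a rename moves the file in place, so a second copy of the big dump never gets written.

```python
from pathlib import Path


def rename_dump(downloaded: str, target: str = "enwiki.xml") -> Path:
    """Rename a downloaded dump to `target` in the same directory.

    Hypothetical helper for illustration. A rename within one directory
    does not duplicate the data, unlike copying the file.
    """
    src = Path(downloaded)
    dst = src.with_name(target)
    # Refuse to clobber an existing file; the dump is expensive to re-download.
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    src.rename(dst)
    return dst
```

Calling rename_dump() with the path of the downloaded file right after the download finishes leaves a single enwiki.xml for the later steps to use.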
xmldatadumps-l@lists.wikimedia.org