I stripped out the <redirect /> tags and imported enwiki using xml2sql, but none of the templates rendered correctly. For example, navigating to /The_Matrix results in a page with lots of raw MediaWiki source like
{{#if: |This {{#ifeq:||article|page}} is about . }}For {{#if:the series|the series|other uses}}, see {{#if:The Matrix (franchise)|The Matrix (franchise){{#ifeq:the setting|and| and {{#if:Matrix (fictional universe)|Matrix (fictional
Any ideas if this is a known problem with xml2sql, or did something get corrupted during my import? I haven't yet tried importDump.php because it seems to be extremely slow (can only import a few pages per second)
Eric
On Fri, Feb 5, 2010 at 1:13 AM, Andrew Krizhanovsky andrew.krizhanovsky@gmail.com wrote:
Yes, it was safe in my case (import of Russian and English Wiktionary). See http://meta.wikimedia.org/wiki/Talk:Xml2sql and example of script or shell command to strip out the <redirect />
-- Andrew.
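
For illustration, stripping the tags before feeding the dump to xml2sql could look roughly like the sketch below. This is not the script from the Talk:Xml2sql page; the function names and file paths are hypothetical, and it assumes each <redirect /> is a self-closing element that does not span multiple lines.

```python
import bz2
import re

# Matches a self-closing redirect element such as <redirect /> or
# <redirect title="Foo" />, plus any whitespace in front of it.
# Assumption: the element never spans more than one line.
REDIRECT_RE = re.compile(r"\s*<redirect\b[^>]*/>")


def strip_redirect(line):
    """Return the line with any self-closing <redirect ... /> removed."""
    return REDIRECT_RE.sub("", line)


def strip_redirects(lines):
    """Yield lines without redirect tags, dropping lines that held only the tag."""
    for line in lines:
        cleaned = strip_redirect(line)
        # Keep the line if something is left, or if it was blank to begin with.
        if cleaned.strip() or not line.strip():
            yield cleaned


def filter_dump(src_path, dst_path):
    """Stream a .xml.bz2 dump to a plain .xml file without redirect tags.

    Both paths are placeholders chosen by the caller.
    """
    with bz2.open(src_path, "rt", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(strip_redirects(src))
```

Calling something like filter_dump("enwiki-20100130-pages-articles.xml.bz2", "enwiki-noredirect.xml") would process the dump line by line, so the whole file never has to fit in memory.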
On Fri, Feb 5, 2010 at 6:38 AM, Eric Sun esun@cs.stanford.edu wrote:
Would it be safe to strip out the <redirect /> tags from the xml and reimport, or will that cause other problems?
Thanks, Eric
On Thu, Feb 4, 2010 at 6:24 PM, Chad innocentkiller@gmail.com wrote:
On Thu, Feb 4, 2010 at 9:12 PM, Eric Sun esun@cs.stanford.edu wrote:
Hi,
I saw this thread back in October where someone was having trouble importing the English Wikipedia XML dump: http://lists.wikimedia.org/pipermail/wikitech-l/2009-October/045594.html The thread back in October seemed to end without resolution, and the tools still seem to be broken, so has anyone found a solution in the meantime?
I'm using mediawiki-1.15.1 and attempting to import enwiki-20100130-pages-articles.xml.bz2.
None of these options seem to work:
- importDump.php
fails by spewing "Warning: xml_parse(): Unable to call handler in_() in ./includes/Import.php on line 437" repeatedly
- xml2sql (http://meta.wikimedia.org/wiki/Xml2sql):
Fails with error: xml2sql: parsing aborted at line 33 pos 16. Is this due to the new <redirect> tag introduced in the new dumps?
- mwdumper (http://www.mediawiki.org/wiki/MWDumper):
Current XML is schema v0.4, but the documentation says that it's for 0.3
Fails immediately: siteinfo: untested generator 'MediaWiki 1.16alpha-wmf', expect trouble ahead
page: expected closing tag in line 35
Any tips? Thanks! Eric
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Most of these errors are caused by the new(ish) <redirect /> tag within <page> elements. 0.4 is the correct version of the schema, but unfortunately the schema was updated, and dumps were produced using it, before the corresponding importer changes made it into a release.
1.15.1 cannot import pages with <redirect />; we should probably backport that. We should also rewrite the importers so they don't barf terribly when they encounter an unknown element.
-Chad
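
As a rough illustration of what "not barfing on an unknown element" could mean, here is a minimal Python sketch of a reader that records unexpected children of <page> and carries on. It is not MediaWiki's importer (which is PHP); the set of known elements, the iter_pages name, and the returned dictionary shape are all assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: a deliberately small set of <page> children this toy reader
# understands; a real importer handles many more.
KNOWN_PAGE_CHILDREN = {"title", "id", "revision"}


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield one dict per <page>, noting unknown children instead of failing."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        page = {"skipped": []}
        for child in elem:
            name = local(child.tag)
            if name == "title":
                page["title"] = child.text
            elif name == "revision":
                for sub in child:
                    if local(sub.tag) == "text":
                        page["text"] = sub.text or ""
            elif name not in KNOWN_PAGE_CHILDREN:
                # Unknown element (e.g. <redirect />): remember it and move on.
                page["skipped"].append(name)
        yield page
        elem.clear()  # release the finished page's subtree


# Hand-written sample in the style of a v0.4 dump, for illustration only.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page>
    <title>Foo</title>
    <id>1</id>
    <redirect />
    <revision><id>2</id><text>#REDIRECT [[Bar]]</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
```

Here the <redirect /> element ends up in the page's "skipped" list rather than aborting the import, which is the behaviour being asked for.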