Jeffrey V. Merkey wrote:
After running mwdumper to strip the NS_MEDIAWIKI namespace entries
from the 20070206 SQL dumps, I found that the filtered XML file it
writes is unusable and leads to SQL syntax errors:
1. The output files cannot be used by mwimport, because mwdumper
modifies the text labels for XML types, etc., to the extent that
mwimport can no longer read the dump.
2. If you take the XML file output by mwdumper and attempt to
reimport it into an empty database with mwdumper, it produces
corrupted SQL statements and fails. It processes about 720,000
articles before failing, however. Output from the MySQL error log
follows:
ERROR 1062 (23000) at line 15327: Duplicate entry '70473566' for key 1
ERROR 1064 (42000) at line 15328: You have an error in your SQL
syntax; check the manual that corresponds to your MySQL server version
for the right syntax to use near ''== Greatest Common Factor / Least
Common Multiple Problem ==\n\nI recently went' at line 1
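
To see what the failing statements actually look like, one option is to
print the lines that the MySQL errors point at. What follows is only a
minimal sketch, not part of mwdumper or mwimport: it assumes the SQL
generated by mwdumper was first saved to a file instead of being piped
straight into mysql, and the function name show_lines and the file name
in the usage comment are hypothetical.

```python
from itertools import islice


def show_lines(path, first, last, width=200):
    """Return (line_number, text) pairs for lines first..last of a file.

    Line numbers are 1-indexed and inclusive, matching the "at line N"
    numbers in the MySQL error messages. Each line is truncated to
    `width` characters, since dump statements can be very long.
    """
    found = []
    # errors="replace" keeps the scan going even if the dump contains
    # byte sequences that are not valid UTF-8.
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        # islice skips the first (first - 1) lines without loading the
        # whole file into memory, then stops after line `last`.
        window = islice(fh, first - 1, last)
        for number, line in enumerate(window, start=first):
            found.append((number, line.rstrip("\n")[:width]))
    return found


# Hypothetical usage, with the SQL saved as "filtered.sql":
#   for number, text in show_lines("filtered.sql", 15326, 15328):
#       print(number, text)
```

Looking at the start of lines 15327 and 15328 this way should show
whether the statement was cut off or its quoting was broken before the
article text began.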
Jeff
NOTE: These errors occur with the standard English dumps and the
existing tools. These dumps are not the modified dumps created by the
machine translator.
Jeff