[Mediawiki-l] Mwdumper crashes on dump enwiki-20070527

Rolf Lampa rolf.lampa at rilnet.com
Thu Jul 5 20:57:02 UTC 2007


Hi all,

I have a problem with mwdumper.

Mwdumper (http://www.mediawiki.org/wiki/Mwdumper) crashes after about 35,000
pages when processing the en-WP dump of 2007-05-27, with the following error:

root@xubuntu-svn:/home/admin/Desktop# jdk1.5.0_12/bin/java -jar mwdumper.jar --format=sql:1.5 enwp-200707 > enwp-200707.sql
  ...
32,000 pages (373.893/sec), 32,000 revs (373.893/sec)
33,000 pages (373.206/sec), 33,000 revs (373.206/sec)
34,000 pages (377.979/sec), 34,000 revs (377.979/sec)
35,000 pages (377.851/sec), 35,000 revs (377.851/sec)
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2048
        at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:375)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:176)
        at org.mediawiki.importer.XmlDumpReader.readDump(Unknown Source)
        at org.mediawiki.dumper.Dumper.main(Unknown Source)
root@xubuntu:/home/admin/Desktop#

More info about the environment:

Java version:

root@xubuntu:/home/admin/Desktop# sudo ./jdk1.5.0_12/bin/java -version
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Client VM (build 1.5.0_12-b04, mixed mode, sharing)

OS: GNU/Linux Xubuntu 6.10
Kernel release: 2.6.17-10-generic
Kernel version: #2 SMP Fri Oct 13 18:45:35 UTC 2006
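
One way to narrow this down might be to parse the same dump outside mwdumper.
The trace ends in the Xerces UTF8Reader, so the sketch below hands the SAX
parser a java.io.Reader instead of a byte stream, which should let the JDK do
the UTF-8 decoding. This is only a diagnostic sketch, not mwdumper code: the
class name DumpProbe and the method countPages are made up for illustration,
and it assumes the dump is uncompressed XML with <page> elements. If it gets
past page 35,000, that would point at the parser's decoder rather than at the
dump itself.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of mwdumper.
class DumpProbe {

    // Counts <page> elements while letting the JDK decode UTF-8.
    static int countPages(InputStream in) throws Exception {
        final int[] pages = {0};
        // Wrapping the bytes in a Reader gives the parser characters,
        // so it does not need its own byte-level UTF-8 reader.
        Reader reader = new InputStreamReader(in, "UTF-8");
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(reader),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if ("page".equals(qName)) {
                        pages[0]++;
                    }
                }
            });
        return pages[0];
    }

    public static void main(String[] args) throws Exception {
        // Read the file named on the command line, or stdin if none is given.
        InputStream in = args.length > 0
                ? new BufferedInputStream(new FileInputStream(args[0]))
                : System.in;
        try {
            System.out.println(countPages(in) + " pages parsed without error");
        } finally {
            in.close();
        }
    }
}
```

It could be run as "java DumpProbe enwp-200707" against the same input file
used above.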

Any ideas, anyone?

Regards,

// Rolf Lampa





