Posted the bug here:
https://issues.apache.org/jira/browse/XERCESJ-1614
Now we just have to wait and see. In the meantime I'll try the official MediaWiki importer and see if that works.
Michael
On Tue, May 21, 2013 at 6:30 PM, Ariel T. Glenn <ariel@wikimedia.org> wrote:
I think this will do it:
http://xerces.apache.org/xerces2-j/jira.html
Ariel
On Tue, 21-05-2013, at 17:53 +0200, Michael Tsikerdekis wrote:
Thanks, Ariel. One small thing: where exactly can I report it upstream? Got a URL?
Michael
On Tue, May 21, 2013 at 5:45 PM, Ariel T. Glenn <ariel@wikimedia.org> wrote:
If you can stomach it, I would report it upstream, linking to the earlier version of the bug they had with a proposed patch etc. I can give them a test file consisting of the one page with all its revisions, "only" 170 MB uncompressed :-D
It's fine to open a report locally in mwdumper too and link the upstream report.
Thanks,
Ariel
On Tue, 21-05-2013, at 15:57 +0200, Michael Tsikerdekis wrote:
Update on the matter. I've edited pom.xml and changed the Xerces version, which was set to 2.7.1, to 2.9.1, 2.11.0, 2.8.0 and other versions.
The out-of-bounds error is different on later versions, but an error still occurs. Also, I tried to use mwdumper with an older version of the Wikipedia dump:
The error still appears, on the first file this time: enwiki-20130102-pages-meta-history1.xml-p000000010p000002070.7z
Should I report a new bug on Bugzilla for mwdumper?
Michael
MediaWiki-l mailing list
MediaWiki-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l