I think this will do it:
http://xerces.apache.org/xerces2-j/jira.html
Ariel
On Tue, 21-05-2013, at 17:53 +0200, Michael Tsikerdekis wrote:
Thanks, Ariel. One small thing: where exactly can I report it upstream? Do you have a URL?
Michael
On Tue, May 21, 2013 at 5:45 PM, Ariel T. Glenn <ariel@wikimedia.org> wrote:
If you can stomach it, I would report it upstream, linking to the earlier version of the bug they had, with a proposed patch etc. I can give them a test file consisting of the one page with all its revisions, "only" 170 MB uncompressed :-D
It's fine to open a report locally for mwdumper too and link to the upstream report.
Thanks,
Ariel
On Tue, 21-05-2013, at 15:57 +0200, Michael Tsikerdekis wrote:
Update on the matter: I've edited pom.xml and changed the xerces version, which was set to 2.7.1, to 2.9.1, 2.11.0, 2.8.0 and other versions.
The out-of-bounds error changes with later versions, but it still persists. I also tried mwdumper with an older Wikipedia dump: 20130102.
The error still appears, this time on the first file: enwiki-20130102-pages-meta-history1.xml-p000000010p000002070.7z
Should I file a new bug on Bugzilla for mwdumper?
Michael