Great! At least we know what's causing it. I had seen the xerces thread
before, but it was so old that I assumed it was unrelated.
Let me know when there is a new build to try out, or if there is anything
else I can do to help fix the problem.
Michael
On Mon, May 20, 2013 at 4:41 PM, Ariel T. Glenn <ariel(a)wikimedia.org> wrote:
On Mon, 20-05-2013 at 13:18 +0200, Michael Tsikerdekis wrote:
33 pages (0.593/sec), 25,374 revs (455.695/sec)
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2048
    at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
...
The file itself is fine; proof of that is that I isolated the
problematic page, removed the first revision (which had been processed
without problems) and then all remaining revisions including the 'bad'
one were handled properly.
This is most likely a regression:
http://www.gossamer-threads.com/lists/wiki/mediawiki/128069
Our spec says to build against maven's xerces version 2.7.1, and I
expect that never got the patch [1]. I'm not sure what version of the
xerces library is good ([2]).
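
For anyone who wants to check which parser their own setup actually picks up, here is a minimal sketch (not part of the dump tool; the class name ParserInfo and its helper methods are made up for illustration). It asks JAXP which SAX parser implementation it selects, prints where that class was loaded from, and, assuming a standalone Xerces-J jar is on the classpath, reads its version string through org.apache.xerces.impl.Version by reflection. If Xerces is not on the classpath it just says so.

```java
import java.security.CodeSource;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic helper; the names here are illustrative only.
public class ParserInfo {

    // Creates a SAX parser the same way any JAXP client would.
    private static SAXParser newParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (Exception e) {
            throw new IllegalStateException("could not create a SAX parser", e);
        }
    }

    // Class name of the SAX parser implementation that JAXP selected.
    static String parserClassName() {
        return newParser().getClass().getName();
    }

    // Location (jar or module) the parser class was loaded from.
    static String parserLocation() {
        CodeSource src = newParser().getClass().getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(no code source; probably bundled with the JRE)";
        }
        return src.getLocation().toString();
    }

    // Version string of a standalone Xerces-J, if one is on the classpath.
    // Reflection avoids a compile-time dependency on the Xerces jar.
    static String xercesVersion() {
        try {
            Class<?> v = Class.forName("org.apache.xerces.impl.Version");
            return String.valueOf(v.getMethod("getVersion").invoke(null));
        } catch (ReflectiveOperationException e) {
            return "(standalone Xerces-J not found on the classpath)";
        }
    }

    public static void main(String[] args) {
        System.out.println("SAX parser class: " + parserClassName());
        System.out.println("Loaded from:      " + parserLocation());
        System.out.println("Xerces version:   " + xercesVersion());
    }
}
```

Running it with the same classpath as the failing job should show whether the 2.7.1 jar from the build spec is the one in use, or whether some other parser is being picked up.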
I'm adding Chad back on the cc though since he'll have to update the
build specs. Chad, do you want a bugzilla report for this?
Ariel
[1] http://www.gossamer-threads.com/lists/wiki/mediawiki/128069
[2] https://issues.apache.org/jira/browse/XERCESJ-1257?page=com.atlassian.jira.…