On 17/07/11 19:16, Brion Vibber wrote:
Output comparisons can be a tricky business, but they'll be an important component. I've started on a CLI batch testing framework for the JavaScript parser class (using node.js; in ParserPlayground/tests) that can read through a Wikipedia XML dump and run round-tripping checks; moving on to generating HTML output and comparing it against output from the current MediaWiki parser will be very valuable (though doing comparisons on HTML is tricky!)
-- brion
There's a rough script for comparing parsers in the maintenance folder, but with one parser in PHP and the other in JavaScript, it can be hard to do well. Spawning a process for each article would be slow... Perhaps using the SpiderMonkey PECL extension [1]?
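
Another way to avoid the per-article spawn might be to keep a single node.js process alive and feed it articles in batches, one JSON object per line. Below is only a rough sketch of that idea, not anything that exists in ParserPlayground: processBatch, normalizeHtml, stubParse and the { title, wikitext, expectedHtml } line format are all made up for illustration, and stubParse merely stands in for the real JavaScript parser class.

```javascript
// Hypothetical batch worker: one long-lived process handles many articles
// instead of spawning a new process for each one.

// Stand-in for the real JavaScript parser (an assumption for illustration only).
function stubParse(wikitext) {
  return "<p>" + wikitext.trim() + "</p>";
}

// Collapse whitespace so that trivial formatting differences between the two
// parsers do not count as mismatches. Real HTML comparison needs far more care.
function normalizeHtml(html) {
  return html.replace(/\s+/g, " ").replace(/>\s+</g, "><").trim();
}

// Each non-empty input line is a JSON object: { title, wikitext, expectedHtml },
// where expectedHtml would come from the PHP parser.
// Returns one JSON result line per article.
function processBatch(lines, parse) {
  return lines
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const { title, wikitext, expectedHtml } = JSON.parse(line);
      const actual = normalizeHtml(parse(wikitext));
      const expected = normalizeHtml(expectedHtml);
      return JSON.stringify({ title, match: actual === expected });
    });
}
```

The PHP side could then open the worker once (for example with proc_open), write one line per article to its stdin, and read the result lines back, so the process start-up cost is paid once per run rather than once per article.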