Hm. Sounds like an opportunity. How about MediaWiki issuing a grand challenge: create a well-documented, well-structured (open source) parser that produces the same results as the current parser on 98% of Wikipedia pages? The prize is bragging rights and a letter of commendation from someone or other. I suspect there are a bunch of graduate students out there who would find the challenge interesting.
Rationalizing the parser would help the development process. For the 2% of pages that fail, challenge others to fix them. The key is not getting stuck in the "we need a formal syntax" debate. If the challengers want to create a formal syntax, that is up to them. MediaWiki should only be interested in the final results.
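To make the 98% criterion concrete, here is a rough sketch in Python of how agreement between the two parsers could be scored. Everything named here (agreement_rate, meets_challenge, the Parser type) is made up for illustration and is not part of MediaWiki. It assumes each parser can be wrapped as a function from wikitext to output HTML, and it compares output character for character, which a real contest might want to relax (whitespace normalization, for instance).

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical wrapper type: a parser takes wikitext and returns rendered output.
Parser = Callable[[str], str]


def agreement_rate(
    pages: Iterable[Tuple[str, str]],
    reference: Parser,
    candidate: Parser,
) -> Tuple[float, List[str]]:
    """Return the fraction of pages where both parsers agree, plus the
    titles of the pages where they differ.

    `pages` yields (title, wikitext) pairs. Comparison is exact string
    equality, which is an assumption of this sketch.
    """
    total = 0
    failures: List[str] = []
    for title, wikitext in pages:
        total += 1
        if reference(wikitext) != candidate(wikitext):
            failures.append(title)
    if total == 0:
        raise ValueError("no pages to compare")
    return (total - len(failures)) / total, failures


def meets_challenge(rate: float, threshold: float = 0.98) -> bool:
    """True if the agreement rate reaches the proposed 98% bar."""
    return rate >= threshold
```

The list of failing titles is the useful by-product: it is exactly the "2% of pages" that could be handed to others to fix.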
--- On Tue, 7/14/09, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
They're supposed to pass, in theory, but never have. Someone wrote the tests and the expected output at some point as a sort of to-do list. I don't know why we keep them, since they just confuse everything and make life difficult. (Using the --record and --compare options helps, but they're not that convenient.) All of them would require monkeying around with the parser that nobody's willing to do, since the parser is a hideous mess that no one understands or wants to deal with unless absolutely necessary.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l