2010-09-27 22:46, Paul Houle wrote:
On 9/27/2010 2:58 PM, Chad wrote:
This. Tim sums up the consensus very well with
that commit summary.
He also made some comments on the history of wikitext and alternative
parsers on foundation-l back in Jan '09[0]. Worth a read (starting mainly
at '"Parser" is a convenient and short name for it').
While a real parser is a nice pipe dream, in practice not a single project
to "rewrite the parser" has succeeded in all the years people have been
trying. Like Aryeh says, if you can pull it off and make it practical,
hats off to you.
For my own IX work I've written a MediaWiki markup parser in C#
based on the Irony framework. It fails to parse about 0.5% of pages in
Wikipedia
What do you mean by "fail"? That it assigns slightly incorrect semantics
to a construction? That it fails to accept the input? That it crashes?
and is oblivious to a lot of the stranger stuff [like the HTML
intrusions], but it does a good job of eating infoboxes and making sense
of internal and external links. Now, the strange stuff plus the parse
failures would probably be impossible to handle in a rational way...
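As an aside, the two link forms at least are mechanically simple to
recognize, whatever framework you use. A minimal hand-rolled C# sketch,
purely for illustration (this is not Paul's Irony grammar; the class
name and regexes are mine):

    using System;
    using System.Text.RegularExpressions;

    static class WikiLinks
    {
        // [[Target]] or [[Target|Label]] -- an internal link
        static readonly Regex Internal =
            new Regex(@"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]");

        // [http://url Label] -- an external link
        static readonly Regex External =
            new Regex(@"\[(https?://\S+)(?:\s+([^\]]*))?\]");

        static void Main()
        {
            string text = "See [[Main Page|the main page]] and "
                        + "[http://example.org an example].";
            foreach (Match m in Internal.Matches(text))
                Console.WriteLine("internal: target='{0}' label='{1}'",
                                  m.Groups[1].Value, m.Groups[2].Value);
            foreach (Match m in External.Matches(text))
                Console.WriteLine("external: url='{0}' label='{1}'",
                                  m.Groups[1].Value, m.Groups[2].Value);
        }
    }

Of course, real wikitext allows nesting and templates inside link
labels, so regexes alone will not get you all the way; a real grammar
like Paul's is the right tool for the full language.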
I disagree. I believe that there is a rational way to handle all kinds
of input.
/Andreas