On 9/28/2010 3:53 AM, Andreas Jonsson wrote:
>> For my own IX work I've written a Wikimedia markup parser in C# based
>> on the Irony framework. It fails to parse about 0.5% of pages in
>> Wikipedia.
>
> What do you mean by "fail"? Does it assign slightly incorrect semantics
> to a construction? Does it fail to accept the input? Does it crash?
Fails to accept input -- that is, the text doesn't match the grammar.
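To illustrate the distinction, here is a minimal sketch (a toy one-rule grammar, not the actual Irony grammar): the parser either accepts a string as matching the grammar or rejects it, independent of whatever semantics are later attached.

```python
import re

# Toy grammar (illustration only): a "page" is plain text interleaved
# with ''bold'' spans, and every opening '' must have a closing ''.
TOKEN = re.compile(r"''|[^']+|'")

def accepts(text: str) -> bool:
    """Return True iff the text matches the toy grammar,
    i.e. the '' delimiters are balanced."""
    open_bold = False
    for tok in TOKEN.findall(text):
        if tok == "''":
            open_bold = not open_bold
    return not open_bold

print(accepts("plain and ''bold'' text"))   # accepted by the grammar
print(accepts("unterminated ''bold text"))  # rejected: no closing ''
```

A page that trips such a rule is "failing to accept input" in exactly this sense: the parse simply cannot complete, as opposed to completing with a wrong parse tree or crashing the parser.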
Now, the toolchain above the parser gets between 30% and 80% recall at
the moment on the tasks it has to do, so improving the grammar isn't
the highest priority on my list.