Probably, if anybody ever wants this kind of functionality done, we need to direct them to start helping us define the parser behaviour.
Well, I've been advocating that since the second I heard of such projects. If they start doing it, let me know. ;)
I think a lot of people have *started* doing it. It's *finishing* that's the tricky bit. :P
As one of the many people who've done so, I agree. :) The problem is that ~80% of wikimarkup is pretty straightforward to parse using standard methods, another 10-15% can be done without huge difficulty using known-but-less-standard methods, and the remaining 5% doesn't fit well at all into any of the normal models of lexing/parsing.
[...snip...]
-Mark
Could I ask you to give some examples you encountered from the 10-15% hard category and the 5% very hard category?
I ask so that if anyone feels tempted to start on defining the behaviour, we can gently suggest doing the harder stuff *first* (with examples), thus hopefully preventing the situation where we have multiple unfinished 80%-done definitions, and no 100%-complete formal definitions.
All the best, Nick.