On Fri, Nov 29, 2002 at 03:51:21PM +0100, Tomasz Wegrzanowski wrote:
> I was trying to fit math parsing somewhere into the 15-pass Wikipedia markup parser. Because there is no way to protect math markup from being interpreted by the later wiki passes, I had to put it before removeHTMLtags. But then removeHTMLtags breaks the <img src="" alt=""> tag. Even if I tell removeHTMLtags to accept IMG with ALT and SRC, replaceExternalLinks would still mangle the URL in src="".
> Any ideas?
> And some day we should make that parser a single-pass LALR parser...
Feel free to contribute to the mod_wiki design document; since mod_wiki is written in C, you can use lex and yacc for the parser. I assumed Clutch was going to use a simple state machine: it avoids the overhead of regex libraries, and lex and yacc have always been fairly finicky. But if they are copacetic to your way of coding, by all means, bring it on.
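To make the state machine idea concrete, here is a rough sketch of how a single pass could leave math markup alone while still processing everything else. It is only an illustration, not the mod_wiki design: the `<math>`...`</math>` delimiters, the function names `wiki_scan` and `emit`, and the choice to escape `<` outside math are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed delimiters; the real math markup may differ. */
#define MATH_OPEN  "<math>"
#define MATH_CLOSE "</math>"

enum state { IN_TEXT, IN_MATH };

/* Append n bytes of s to out; returns 0 if the result (plus NUL) would not fit. */
static int emit(char *out, size_t cap, size_t *len, const char *s, size_t n)
{
    if (*len + n >= cap)
        return 0;
    memcpy(out + *len, s, n);
    *len += n;
    out[*len] = '\0';
    return 1;
}

/*
 * Hypothetical single-pass scanner. Outside math, '<' is escaped as "&lt;"
 * (a stand-in for whatever the ordinary wiki rules would do). Inside
 * <math>...</math>, bytes are copied verbatim, so no other rule can touch
 * them. Returns 0 on success, -1 if out is too small.
 */
int wiki_scan(const char *in, char *out, size_t cap)
{
    const size_t open_len = strlen(MATH_OPEN);
    const size_t close_len = strlen(MATH_CLOSE);
    enum state st = IN_TEXT;
    size_t len = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';

    while (*in != '\0') {
        int ok;

        if (st == IN_TEXT && strncmp(in, MATH_OPEN, open_len) == 0) {
            st = IN_MATH;                       /* enter protected span */
            ok = emit(out, cap, &len, MATH_OPEN, open_len);
            in += open_len;
        } else if (st == IN_MATH && strncmp(in, MATH_CLOSE, close_len) == 0) {
            st = IN_TEXT;                       /* leave protected span */
            ok = emit(out, cap, &len, MATH_CLOSE, close_len);
            in += close_len;
        } else if (st == IN_TEXT && *in == '<') {
            ok = emit(out, cap, &len, "&lt;", 4);
            in++;
        } else {
            ok = emit(out, cap, &len, in, 1);   /* plain byte, or anything inside math */
            in++;
        }
        if (!ok)
            return -1;
    }
    return 0;
}
```

Calling `wiki_scan` on a string such as `a<b <math>x<y</math> c<d` escapes the two `<` characters outside the math span and leaves the one inside it untouched. Because the current state decides which rules apply, there is no ordering problem between passes of the kind described above.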
Beginnings of the design document:
http://www.wikipedia.org/wiki/User:Clutch/mod_wiki
Jonathan