On 5/10/05, Andrew Rodland arodland@entermail.net wrote:
Recently I've been working on parsing wikitext, both for some projects of my own and because I think a general-purpose parser is just a good thing to have. In any case, I've made significant progress on a Perl implementation. I'll link to the code below, but what I think is more interesting is the English description. As far as I can tell, it is possible to parse wikitext in a "single pass" fashion, and it's possible to do it quickly (more on that once I finish the last few features I need to enable benchmarking). In fact, I believe that if one is willing to forgo the TOC at the top, it's possible to parse and render incrementally (though for various reasons that's probably not such a great idea). With regard to http://www.usemod.com/cgi-bin/mb.pl?ConsumeParseRenderVsMatchTransform , my code is of the "consume/parse/render" variety.
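For readers who haven't met the distinction, here is a minimal sketch of what "consume/parse/render" in a single pass can look like. It is not the parser from this post: it is written in Python rather than Perl, the names render and render_inline are made up for illustration, and it assumes a tiny subset of wikitext (level-2 headings, bold, italics, blank-line-separated paragraphs) with no attempt to handle misnested quotes or deeper heading levels.

```python
import html


def render_inline(text):
    """Consume text left to right, toggling bold/italic on quote runs."""
    out = []
    bold = italic = False
    i = 0
    while i < len(text):
        if text.startswith("'''", i):      # three quotes toggle bold
            out.append("</b>" if bold else "<b>")
            bold = not bold
            i += 3
        elif text.startswith("''", i):     # two quotes toggle italics
            out.append("</i>" if italic else "<i>")
            italic = not italic
            i += 2
        else:                              # ordinary character: escape and emit
            out.append(html.escape(text[i], quote=False))
            i += 1
    # Close anything still open at the end (nesting order is not tracked).
    if italic:
        out.append("</i>")
    if bold:
        out.append("</b>")
    return "".join(out)


def render(lines):
    """Yield HTML fragments as soon as each block is complete."""
    para = []
    for line in lines:
        line = line.strip()
        heading = None
        if len(line) > 4 and line.startswith("==") and line.endswith("=="):
            heading = line[2:-2].strip()
        if heading is not None or not line:
            if para:                       # a heading or blank line ends a paragraph
                yield "<p>" + render_inline(" ".join(para)) + "</p>"
                para = []
            if heading is not None:
                yield "<h2>" + render_inline(heading) + "</h2>"
        else:
            para.append(line)
    if para:                               # flush the final paragraph
        yield "<p>" + render_inline(" ".join(para)) + "</p>"
```

Because render is a generator, output can be written as input arrives, which is the incremental behaviour mentioned above. A table of contents at the top would break this, since it cannot be emitted until every heading has been seen.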
Anyway, I'll stop rambling and get to the point. Right now, I doubt that my code is clean enough or useful enough for anyone else to use, but I'm mentioning it in case it provides any insight or grounds for discussion, or in case anyone would like to base work on it or make suggestions.
The code is part of the BerliOS project "wikioncd"; svn is at svn://svn.berlios.de/wikioncd/trunk/wikioncd (the parser, with a temporary driver, is in parser.pl); the ViewCVS for the same tree is at http://svn.berlios.de/viewcvs/wikioncd/trunk/wikioncd/ .
Could you please add your parser implementation to this list: http://meta.wikimedia.org/wiki/Alternative_parsers