On Thu, 08 Nov 2007 14:55:02 +1100, Steve Bennett wrote:
On 11/8/07, Simetrical <Simetrical+wikilist@gmail.com> wrote:
- Now that we have a grammar, a yacc parser is compiled, and appropriate rendering bits are added to get it to render to HTML.
People have already done this, at least once, haven't they? Do we have a list of attempts?
- The stuff the BNF grammar doesn't cover is tacked on with some other methods. In practice, it seems like a two-pass parser would be ideal: one recursive pass to deal with templates and other substitution-type things, then a second pass with the actual grammar of most of the language. The first pass is of necessity recursive, so there's probably no point in having it spend the time to repeatedly parse italics or whatever, when it's just going to have to do it again when it substitutes stuff in. Further rendering passes are going to be needed, e.g., to insert the table of contents. Further parsing passes may or may not be needed.
Ouch, now you're up to about 4 passes, which isn't far off the current version. Two passes would be good, like a C compiler: once for meta-markup (templates, parser functions), and once for content. Would it be possible to perhaps have an in-place pattern-based parser for the first phase, then a proper recursive descent for the content?
If we're going to get anything done, we'd likely need incremental improvements (or really strong motivation). Yes, four passes is a little more complex than we'd want, and would make the grammar a bit unwieldy. But it's still not as complex or unwieldy as it would need to be to handle all of the corner cases in the current parser. So it would at least be a step in the right direction.
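
To make the two-phase idea concrete, here is a rough sketch of what I have in mind. It is a toy under stated assumptions, not a proposal for the real parser. The TEMPLATES dict and the names expand_templates, parse_inline and render are made up for illustration, templates take no arguments, and the "grammar" covers only ''italic'' and '''bold''' with no HTML escaping. Phase 1 is the in-place, pattern-based pass (it never looks at italics), and phase 2 is a small recursive descent over the already-expanded text.

```python
import re

# Hypothetical template store; in a real wiki these would be pages.
TEMPLATES = {
    "greeting": "Hello, ''{{name}}''!",
    "name": "world",
}

# An innermost template call: {{name}} with no braces or pipes inside.
INNERMOST = re.compile(r"\{\{([^{}|]+)\}\}")


def expand_templates(text, max_rounds=20):
    """Phase 1: substitute templates in place until nothing changes."""
    for _ in range(max_rounds):
        expanded = INNERMOST.sub(
            # Unknown templates are left untouched.
            lambda m: TEMPLATES.get(m.group(1).strip(), m.group(0)), text)
        if expanded == text:
            return expanded
        text = expanded
    raise ValueError("template expansion did not settle (template loop?)")


def at(text, pos, token):
    """True if token starts at pos; '' must not be the start of '''."""
    if not text.startswith(token, pos):
        return False
    return token != "''" or not text.startswith("'''", pos)


def parse_inline(text, pos=0, closer=None):
    """Phase 2: recursive descent over bold/italic.

    Returns (html, next_pos). Unclosed markup is closed at end of input.
    """
    out = []
    while pos < len(text):
        if closer and at(text, pos, closer):
            return "".join(out), pos + len(closer)
        if at(text, pos, "'''"):
            inner, pos = parse_inline(text, pos + 3, "'''")
            out.append("<b>" + inner + "</b>")
        elif at(text, pos, "''"):
            inner, pos = parse_inline(text, pos + 2, "''")
            out.append("<i>" + inner + "</i>")
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out), pos


def render(wikitext):
    """Run both phases: expand templates, then parse the content."""
    html, _ = parse_inline(expand_templates(wikitext))
    return html


print(render("{{greeting}}"))  # Hello, <i>world</i>!
```

The point of the split is that the italics in the greeting template are parsed exactly once, after all substitution is finished. A table-of-contents pass would still have to run over the output of render, which is where the extra passes come from.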