Jan Hidders wrote:
On Monday 16 August 2004 19:35, Magnus Manske wrote:
So, what you'd need is an EBNF representation or something?
Mind if I jump in and suggest a substitution for the "or something" alternative? :) Check out "parsing expression grammar" on Wikipedia (and, in more detail, the external links that article leads to). At the moment you probably won't find a parser generator that will produce the PHP code you want from one, but if the primary goal here is formal specification then that's not such an issue. And in any case, unlike (LA)LR CFGs, parsing expression grammars tend to be very easy to convert manually into working parsers.
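To make the manual-conversion point concrete, here is a minimal sketch in Python. The three-rule grammar in the comment is a toy invented for illustration, not the actual wikitext grammar, and the function names are hypothetical. Each PEG rule becomes one function that takes the input and a position and returns either (node, new position) or None on failure.

```python
# Toy PEG (hypothetical, NOT the real wikitext grammar):
#   Text <- (Bold / Char)*
#   Bold <- "'''" (!"'''" Char)+ "'''"
#   Char <- .

def parse_char(s, pos):
    # Char <- .   (any single character)
    if pos < len(s):
        return ("char", s[pos]), pos + 1
    return None

def parse_bold(s, pos):
    # Bold <- "'''" (!"'''" Char)+ "'''"
    if not s.startswith("'''", pos):
        return None
    pos += 3
    start = pos
    # (!"'''" Char)+ : consume characters up to the next "'''"
    while pos < len(s) and not s.startswith("'''", pos):
        pos += 1
    if pos == start or not s.startswith("'''", pos):
        return None  # empty body or missing closing marker: the rule fails
    return ("bold", s[start:pos]), pos + 3

def parse_text(s):
    # Text <- (Bold / Char)*   (ordered choice: try Bold first, then Char)
    pos, nodes = 0, []
    while pos < len(s):
        node, pos = parse_bold(s, pos) or parse_char(s, pos)
        nodes.append(node)
    return nodes

print(parse_text("a'''b'''c"))
```

Because Char matches any character and is tried whenever Bold fails, this toy parser accepts every input string: an unclosed ''' simply falls through and is read as three literal apostrophes.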
Yes, that is all I ask. A precise formal grammar in whatever notation you like, but preferably in the input format of bison, if only because that would document exactly what your parser does. Note that there are some requirements:
- it should accept all possible input strings,
- it should ideally do all the necessary parsing, i.e. from the parse tree we should be able to generate the output with a single tree-walk, and
- it should be unambiguous (and even LALR(1)) or have an explicit conflict resolution rule
The first and third requirements above are likely to be _very_ difficult to achieve at the same time in a CFG paradigm, because of the limited lookahead and extremely rudimentary disambiguation facilities of LR-class parser generators. LR parser generators are designed for languages that were designed for LR parser generators; they tend to be difficult or impossible to use for more freeform languages such as wikitext without making some serious compromises or horrible hacks. Not to discourage you from trying, though; I just want to point out an alternative that you might consider when the shift-reduce and reduce-reduce conflicts start becoming unbearable. :)
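As a rough illustration of how a parsing expression grammar handles disambiguation, the sketch below implements PEG-style ordered choice in Python. The helper names `literal` and `choice` and the apostrophe-marker example are assumptions made up for this illustration; they are not taken from any existing wikitext parser. The order of the alternatives is itself the explicit conflict resolution rule: the first alternative that matches wins, so there is nothing left to be ambiguous.

```python
def literal(text):
    # Build a parser that matches exactly `text` at the given position.
    def parse(s, pos):
        if s.startswith(text, pos):
            return text, pos + len(text)
        return None
    return parse

def choice(*alternatives):
    # PEG ordered choice (e1 / e2 / ...): try each alternative in order
    # and commit to the first one that succeeds.
    def parse(s, pos):
        for alt in alternatives:
            result = alt(s, pos)
            if result is not None:
                return result
        return None
    return parse

# Hypothetical rule: Marker <- "'''" / "''" / "'"
# Longer markers are listed first, so they take priority over their prefixes.
marker = choice(literal("'''"), literal("''"), literal("'"))

print(marker("'''bold'''", 0))
print(marker("''italic''", 0))
```

Listing the alternatives in the opposite order would make the single apostrophe win every time, which is why the ordering has to be chosen deliberately when writing the grammar.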
One caveat, though: I'm not exactly unbiased, since I wrote much of the aforementioned stuff on parsing expression grammars. :)
OK, I'll shut up now.
Cheers, Bryan