Jan Hidders wrote:
>On Monday 16 August 2004 19:35, Magnus Manske wrote:
>> So, what you'd need is an EBNF representation or something?
Mind if I jump in and suggest a substitution for the "or something"
alternative? :) Check out "parsing expression grammar" on Wikipedia (and, in
more detail, the external links that article leads to). At the moment you
probably won't find a parser generator that will generate the PHP code you
want from one, but if the primary goal here is formal specification then
that's not much of an issue. In any case, unlike (LA)LR CFGs, parsing
expression grammars tend to be very easy to convert by hand into working
parsers.
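To make the "easy to convert" point a bit more concrete, here is a minimal
sketch in Python (not PHP, and not taken from any existing wikitext parser).
The grammar is a made-up toy that knows only '''bold''' and plain text; the
rule names and function names are assumptions for illustration only.

```python
# Toy PEG (hypothetical, for illustration only):
#   Doc   <- Item*
#   Item  <- Bold / Plain
#   Bold  <- "'''" (!"'''" .)+ "'''"
#   Plain <- .
# Each rule becomes one function taking (text, pos) and returning
# (node, new_pos) on success or None on failure.

def parse_bold(s, i):
    if not s.startswith("'''", i):
        return None
    j = i + 3
    # (!"'''" .)+ : consume characters until the closing marker comes next.
    while j < len(s) and not s.startswith("'''", j):
        j += 1
    if j == i + 3 or not s.startswith("'''", j):
        return None  # empty body, or no closing marker before end of input
    return ("bold", s[i + 3:j]), j + 3

def parse_plain(s, i):
    # "." matches any single character; it fails only at end of input.
    if i < len(s):
        return ("text", s[i]), i + 1
    return None

def parse_item(s, i):
    # Prioritized choice: try Bold first, fall back to Plain.
    return parse_bold(s, i) or parse_plain(s, i)

def parse_doc(s):
    nodes, i = [], 0
    while True:
        result = parse_item(s, i)
        if result is None:
            break
        node, i = result
        # Merge adjacent plain characters so the output is easier to read.
        if node[0] == "text" and nodes and nodes[-1][0] == "text":
            nodes[-1] = ("text", nodes[-1][1] + node[1])
        else:
            nodes.append(node)
    return nodes
```

In this sketch, unmatched markup such as "'''x" simply falls through to
Plain, so every input string parses, and because the alternatives are tried
in a fixed order there is only ever one result for a given input.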
>Yes, that is all I ask. A precise formal grammar in whatever notation you
>like, but preferably in the input format of bison. If only because that would
>document what it is that your parser exactly does. Note there are some
>requirements:
>- it should accept all possible input strings
>- it should ideally do all the necessary parsing i.e. from the parse tree we
>should be able to generate the output with a single tree-walk, and
>- it should be unambiguous (and even LALR(1)) or have an explicit conflict
>resolution rule
The first and third requirements above are likely to be _very_ difficult to
achieve at the same time in a CFG paradigm, because of the limited lookahead
and extremely rudimentary disambiguation facilities of LR-class parser
generators. LR parser generators are designed for languages that were
designed for LR parser generators; they tend to be difficult or impossible to
use for more freeform languages such as wikitext without making some serious
compromises or horrible hacks. Not to discourage you from trying, though; I
just want to point out an alternative that you might consider when the
shift-reduce and reduce-reduce conflicts start becoming unbearable. :)
One caveat, though: I'm not exactly unbiased, since I wrote much of the
aforementioned stuff on parsing expression grammars. :)
OK, I'll shut up now.
Cheers,
Bryan