Bryan Ford wrote:
>> - it should accept all possible input strings
>> - it should ideally do all the necessary parsing, i.e. from the parse tree we
>> should be able to generate the output with a single tree-walk, and
>> - it should be unambiguous (and even LALR(1)) or have an explicit conflict
>> resolution rule
> The first and third requirements above are likely to be _very_ difficult to achieve at the same time in a CFG paradigm, because of the limited lookahead and extremely rudimentary disambiguation facilities of LR-class parser generators. LR parser generators are designed for languages that were designed for LR parser generators; they tend to be difficult or impossible to use for more freeform languages such as wikitext without making some serious compromises or horrible hacks.
Can you provide a more specific example of this? So far, I have not encountered a situation where accepting all possible input strings would be hard to do. In fact, all I need to do to ensure this is to allow the "text" non-terminal to contain any token.
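To make the catch-all idea concrete, here is a minimal sketch in Python rather than bison. The token set (''' for bold, '' for italic) and the names tokenize, parse and render are assumptions made up for illustration, not the actual grammar under discussion. It uses a hand-written recursive scan instead of an LALR parser, so it only shows the principle: if any token that does not form valid markup falls through to plain text, no input string is ever rejected.

```python
import re

# Hypothetical token set: bold and italic quote markers; everything else is text.
# The alternatives together cover every character, so tokenizing never fails.
TOKEN_RE = re.compile(r"'''|''|[^']+|'")


def tokenize(source):
    """Split the input into markup markers and runs of plain text."""
    return TOKEN_RE.findall(source)


def parse(tokens):
    """Return a list of nodes: plain strings or (tag, children) tuples."""
    nodes = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("''", "'''"):
            # Look for a matching closing marker further along.
            try:
                j = tokens.index(tok, i + 1)
            except ValueError:
                j = -1
            if j != -1:
                tag = "b" if tok == "'''" else "i"
                nodes.append((tag, parse(tokens[i + 1:j])))
                i = j + 1
                continue
        # Catch-all: any token, including an unmatched marker, is plain text.
        nodes.append(tok)
        i += 1
    return nodes


def render(nodes):
    """Generate output with a single walk over the tree."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
        else:
            tag, children = node
            out.append(f"<{tag}>{render(children)}</{tag}>")
    return "".join(out)
```

For example, render(parse(tokenize("''unclosed"))) hands the stray marker back unchanged as text instead of reporting a syntax error.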
Also, I have not found it difficult to craft the grammar in such a way that bison's default disambiguation rules (conflict resolution rules) produce the correct result. There does not seem to be any real need to have the grammar be unambiguous.
> I just want to point out an alternative that you might consider when the shift-reduce and reduce-reduce conflicts start becoming unbearable. :)
So far, I have not had a reduce/reduce conflict, and, as I said, the shift/reduce conflicts are not a problem.
Timwi