On 11/13/07, Virgil Ierubino <virgil.ierubino@gmail.com> wrote:
It's clear that BNF can't formulate outside these constraints because if (1) was false, you'd never get to the end of the specification, and (2) can't be false simply because it restates (1) but also allows for languages with nested syntax. If it were false that the nested/complex syntax could be broken down into basic syntax, then that complex syntax would simply BE basic syntax, and you are left with just rule (1).
Wikitext fails these constraints because the construct:
**bullet 1
**bullet 2
which are two successive level 2 bullet points, can't be broken down into less complex syntax. Therefore the "**" construct for a level 2 bullet point must be BASIC syntax, not complex syntax. But seeing as Wikitext, just like XHTML etc., allows for bulleted lists with infinite nesting levels, there would be an infinite number of basic constructs (for level 2 list items, level 3, 4, 5, 17, 234, etc). We thus fail constraint number (1).
This was my initial reaction. However, I don't think it's actually that important. Because in fact this:
**foo
##blah
*is* valid syntax. As is this:
*foo
**blah
#*blah
Which means that each line can be parsed on its own merits, then a subsequent pass can perform the code generation. This will likely be the general story for the new parser: a traditional parser model with a couple of hacks to cope with the nuances of wikitext, as opposed to a parser built with hacks from the ground up.
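To make the two-pass idea concrete, here is a minimal sketch in Python. It is an illustration only, not the planned parser: the names parse_line, render and TAGS are made up for this example, and the output is a simplified tag stream (nested lists are not wrapped in their parent item) rather than what MediaWiki actually emits. Pass 1 looks at each line in isolation; pass 2 compares each prefix with the previous one to decide which lists to open and close.

```python
import re

# Assumed marker set: "*" bullet, "#" numbered, ":" indent.
LIST_LINE = re.compile(r"^([*#:]+)\s*(.*)$")
TAGS = {"*": ("ul", "li"), "#": ("ol", "li"), ":": ("dl", "dd")}


def parse_line(line):
    """Pass 1: split one line into (prefix, text); prefix is '' for non-list lines."""
    m = LIST_LINE.match(line)
    return (m.group(1), m.group(2)) if m else ("", line)


def render(lines):
    """Pass 2: compare each prefix with the previous one to open and close lists."""
    out = []
    open_prefix = ""  # markers of the lists currently open, outermost first
    for prefix, text in map(parse_line, lines):
        # Length of the shared leading part of the old and new prefixes.
        common = 0
        while (common < len(open_prefix) and common < len(prefix)
               and open_prefix[common] == prefix[common]):
            common += 1
        # Close lists that the new prefix no longer contains, innermost first.
        for marker in reversed(open_prefix[common:]):
            out.append("</%s>" % TAGS[marker][0])
        # Open the lists that the new prefix introduces.
        for marker in prefix[common:]:
            out.append("<%s>" % TAGS[marker][0])
        if prefix:
            item = TAGS[prefix[-1]][1]
            out.append("<%s>%s</%s>" % (item, text, item))
        else:
            out.append(text)
        open_prefix = prefix
    # Close whatever is still open at the end of the input.
    for marker in reversed(open_prefix):
        out.append("</%s>" % TAGS[marker][0])
    return out
```

With this sketch, render(["**foo", "##blah"]) closes both bullet lists and opens two numbered ones, because the two prefixes share no leading markers. No per-level grammar rule is needed for that; the nesting falls out of the prefix comparison in the second pass.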
wrong. Our problem is simply that the syntax for adding a bullet at the SAME
level has changed now that we're dealing with another level of bullets - at level 1, another bullet at the same level is "*", but at level 2, another bullet at the same level is "**" - but this means that each level of bullets has its own syntax, meaning each bullet level construct is a basic construct, and therefore that there are an infinite number of constructs.
I think honestly a list element will just be defined as an arbitrary sequence of :, # and *, followed by text. EBNF is incapable of expressing how that sequence should be rendered, but that's not a showstopper.
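As a rough illustration of that definition, the rule can be written informally in EBNF and mirrored by a one-line regular expression. This is a sketch under my own assumptions: the rule names and the helper match_list_item are hypothetical, and nothing here says how the markers should be rendered, which is exactly the part the grammar leaves out.

```python
import re

# Line-level rule, written informally:
#   list_item = marker , { marker } , text ;
#   marker    = ":" | "#" | "*" ;
LIST_ITEM = re.compile(r"([:#*]+)(.*)")


def match_list_item(line):
    """Return (markers, text) if the line is a list item, else None."""
    m = LIST_ITEM.fullmatch(line)
    return (m.group(1), m.group(2).strip()) if m else None
```

Any mix of markers is accepted, so match_list_item("*#*item") gives ("*#*", "item") just as readily as a "***" prefix would. The rule is finite even though the nesting depth is unbounded.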
"*#*" rather than the expected "***". There is thus no hope of defining
Wikitext in BNF unless we exhaustively specify every combination of
There is no hope of *fully defining* Wikitext in BNF...
What this means is that we can't use a basic EBNF parser in all the usual
useful ways. We need new solutions.
What this means is that we can't use *just* a basic EBNF parser. We will need an EBNF parser with some hacks/tweaks/special cases.
Steve