On Fri, Sep 25, 2009 at 8:38 AM, Happy-melon <happy-melon(a)live.com> wrote:
> The 10% drove people off cliffs because it is, pretty much by definition,
> the horrible unexpected behaviour that is a *consequence* of not having a
> formal definition. Writing a formal definition is not impossible if you
> require that it be sensible at the final reading. The parser is, in many
> places, *not* sensible, and naturally those quirks are difficult to
> describe, but they're also undesirable overall. A true move to a formal
> language definition involves action from both ends: writing a formal
> definition that follows the current parser in general, *and* being prepared
> to alter the parser to remove some of the more egregious deviations from
> expected behaviour.
I just wanted to state for the record that when we were talking about
this last time, the developers (Brion included) were actually quite
open to changing the semantics of wikitext constructs that weren't
widely used. In other words, it was OK to build a new parser which was
incompatible with the old parser, as long as that didn't break too
much existing wikitext ("too much" being on the order of 1 or 2% of
articles).
Another comment:
>The problem is the ambiguity with italics, (''italics''). So the
>current parser doesn't really make its final decision on what should
>be bold or what should be italic until it hits a newline. If there are
>an even number of both bold and italics then it assumes it interpreted
>the line correctly.
...
>I think this is part of what makes wikitext undescribable in a formal
>grammar.
Yeah, but from memory, using ANTLR's formal-grammar-breaking features,
this wasn't a massive problem. A small, annoying one, to be sure, but
not a killer. It does tend to mean potentially a lot of back-tracking
though, which is slow...
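
To make the quoted point concrete, here is a rough Python sketch of the
"wait until the newline, then check the counts" idea. It is not the
actual MediaWiki code: the function names are made up for illustration,
and I'm assuming a simplified rule where a run of 2 apostrophes is
italic, 3 is bold, 5 is both, and any other run length is ignored.

```python
import re


def count_quote_runs(line):
    """Count italic and bold markers in one line of wikitext.

    Simplifying assumption (not the real parser's full rule set):
    a run of 2 apostrophes is italic, 3 is bold, 5 is bold+italic,
    and any other run length is ignored.
    """
    italics = 0
    bold = 0
    for run in re.findall(r"'{2,}", line):
        n = len(run)
        if n == 2:
            italics += 1
        elif n == 3:
            bold += 1
        elif n == 5:
            italics += 1
            bold += 1
    return italics, bold


def line_is_balanced(line):
    """Return True if both marker counts are even for this line.

    This can only be answered once the whole line has been seen,
    which is why the decision is deferred until the newline.
    """
    italics, bold = count_quote_runs(line)
    return italics % 2 == 0 and bold % 2 == 0
```

With this sketch, line_is_balanced("''a'' and '''b'''") is True, while
line_is_balanced("'''a''") is False, and in the second case a parser
has to go back and reinterpret one of the runs. That reinterpretation
step is where the back-tracking cost comes from.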
Steve