On Fri, Sep 25, 2009 at 8:38 AM, Happy-melon <happy-melon(a)live.com> wrote:
The 10% drove people off cliffs because it is, pretty much by definition,
the horrible unexpected behaviour that is a *consequence* of not having a
formal definition. Writing a formal definition is not impossible if you
require that it be sensible at the final reading. The parser is, in many
places, *not* sensible, and naturally those quirks are difficult to
describe, but they're also undesirable overall. A true move to a formal
language definition involves action from both ends: writing a formal
definition that follows the current parser in general, *and* being prepared
to alter the parser to remove some of the more egregious deviations from
expected behaviour.
I just wanted to state for the record that when we were talking about
this last time, the developers (Brion included) were actually quite
open to changing the semantics of wikitext constructs that weren't
widely used. In other words, it was OK to build a new parser that was
incompatible with the old one, as long as that didn't break too
much existing wikitext ("too much" being on the order of 1 or 2% of
articles).
Another comment:
The problem is the ambiguity with italics (''italics''). So the
current parser doesn't really make its final decision on what should
be bold or what should be italic until it hits a newline. If there are
an even number of both bold and italics then it assumes it interpreted
the line correctly.
...
I think this is part of what makes wikitext undescribable in a formal
grammar.
Yeah, but from memory, using ANTLR's formal-grammar-breaking features,
this wasn't a massive problem. A small, annoying one, to be sure, but
not a killer. It does tend to mean potentially a lot of back-tracking
though, which is slow...
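
To make the "wait until the newline" behaviour concrete, here is a rough
sketch in Python. It is not the MediaWiki parser's actual code: the function
names are made up for illustration, and the handling of runs of four or more
apostrophes is a simplifying assumption. It only shows why the decision has
to be deferred to the end of the line.

```python
import re

def count_toggles(line):
    """Return (italic_toggles, bold_toggles) for one line of wikitext.

    Simplifying assumptions (not the real parser's rules):
    a run of 2 apostrophes toggles italics, a run of 3 or 4 toggles
    bold, and a run of 5 or more toggles both.
    """
    italics = bold = 0
    for run in re.findall(r"'{2,}", line):
        n = len(run)
        if n == 2:
            italics += 1
        elif n in (3, 4):
            bold += 1
        else:
            italics += 1
            bold += 1
    return italics, bold

def line_is_balanced(line):
    """True if every italic and bold toggle on the line has a partner.

    This can only be answered once the whole line has been seen,
    which is the deferred decision described above.
    """
    italics, bold = count_toggles(line)
    return italics % 2 == 0 and bold % 2 == 0
```

When line_is_balanced() comes back False, a parser has to go back and
reinterpret some of the apostrophe runs, and that second look is roughly
where the back-tracking cost comes from.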
Steve