On 13/11/2007, Thomas Dalton <thomas.dalton(a)gmail.com> wrote:
> To answer my own question, I don't think 2) is possible, due to the
> legitimacy of constructs like:
> Here is some ''italics with a [[link|that switches ''off]] the italics.
That's arguably pathological and confusing even as wikitext. If I saw
that I'd probably make it clearer by hand.
> That's a problem we're going to come across a lot. Most parsers solve
> it by requiring the syntax to be well-formed. Since wikitext is meant
> to be as foolproof as possible, we don't want to make that requirement,
> which means we have to write a parser that can understand a terrible
> mess of tokens.
Yes. Throwing an error is absolutely unacceptable if we're going to
put this in front of the technophobes who muddle through at present.
All strings must be "valid", even if kept from doing wacky things.
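A minimal sketch of what "all strings must be valid" could mean at the
tokenizer level. It assumes a hypothetical token set (italic toggle, link
open, link close, pipe, plain text) and is not MediaWiki's actual code.
Anything that is not a recognised marker falls through as text, so no input
can raise an error:

```python
import re

# Hypothetical markers for this sketch: '' [[ ]] and |
TOKEN_RE = re.compile(r"(''|\[\[|\]\]|\|)")
KINDS = {"''": "ITALIC", "[[": "LINK_OPEN", "]]": "LINK_CLOSE", "|": "PIPE"}


def tokenize(wikitext):
    """Split wikitext into (kind, value) pairs without rejecting any input."""
    tokens = []
    # re.split with a capturing group keeps the markers in the result list.
    for piece in TOKEN_RE.split(wikitext):
        if piece == "":
            continue  # empty strings appear between adjacent markers
        tokens.append((KINDS.get(piece, "TEXT"), piece))
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("]] stray ''' [[ | '' [["):
        print(kind, repr(value))
```

Joining the token values back together reproduces the input exactly, so a
later stage can always fall back to showing a marker as literal text.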
> The only alternative I can think of is running the wikitext through a
> tidier first that detects that kind of mess and adds the appropriate
> close and reopen tags. It requires an extra pass through the text, but
> might be unavoidable.
Definitely, and arguably enhances comprehension of the text. We need
such a pass to keep [[text (bracket)|]] and ~~~~ expansion working in
any case.
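A rough sketch of such a tidying pass, under loud assumptions: it only
knows about '' italics and piped links, ignores bold (''' and '''''),
nested links and templates, and the function name tidy_italics is made up
for illustration. Wherever an italic span would cross the edge of a link
label, it closes the span on one side and reopens it on the other:

```python
import re

# Piped links only: [[target|label]], with no brackets inside either part.
LINK_RE = re.compile(r"\[\[([^\[\]|]*)\|([^\[\]]*)\]\]")


def tidy_italics(wikitext):
    """Insert close/reopen '' markers so italics never straddle a link label."""
    out = []
    italic_open = False  # italic state at the current position
    pos = 0
    for m in LINK_RE.finditer(wikitext):
        before = wikitext[pos:m.start()]
        if before.count("''") % 2 == 1:
            italic_open = not italic_open
        out.append(before)
        target, label = m.group(1), m.group(2)
        if "''" not in label:
            # No toggle inside the label: the link is already well nested.
            out.append(m.group(0))
        else:
            toggles = label.count("''")
            if italic_open:
                out.append("''")       # close the italics before the link
                label = "''" + label   # and reopen them inside the label
            if toggles % 2 == 1:
                italic_open = not italic_open
            suffix = ""
            if italic_open:
                label += "''"          # close inside the label
                suffix = "''"          # and reopen after the link
            out.append("[[%s|%s]]%s" % (target, label, suffix))
        pos = m.end()
    out.append(wikitext[pos:])
    return "".join(out)


if __name__ == "__main__":
    print(tidy_italics(
        "Here is some ''italics with a [[link|that switches ''off]] the italics."
    ))
```

The rendered result should be the same as before, but every italic span
now sits entirely inside or entirely outside a link label.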
> Basically, we accept that wikitext can't be described by EBNF, so start
> by parsing the wikitext into a more restrictive form of wikitext which
> can be described by EBNF, and then parsing that. It's a mess, but it's
> probably better than what we have at the moment.
Or that way around, yes :-)
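To make the second stage concrete, here is a sketch of a strict parser for
a hypothetical restrictive form. The grammar in the comment is an
assumption made up for this example (italics and piped links only), not an
agreed spec. The parser is allowed to reject input because it would only
ever be fed the output of the tidying pass, never raw user text:

```python
import re

# Assumed EBNF for the restrictive form (illustrative only):
#   document = { node } ;
#   node     = italic | link | text ;
#   italic   = "''" , { node } , "''" ;
#   link     = "[[" , text , "|" , { node } , "]]" ;
TOKEN_RE = re.compile(r"(''|\[\[|\]\]|\|)")
SPECIAL = ("''", "[[", "]]", "|")


class StrictParser:
    def __init__(self, wikitext):
        self.toks = [t for t in TOKEN_RE.split(wikitext) if t]
        self.i = 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise ValueError("expected %r at token %d" % (tok, self.i))
        self.i += 1

    def nodes(self, closer=None, in_link=False):
        """Parse nodes until `closer` (or end of input) is the next token."""
        out = []
        while self.peek() is not None and self.peek() != closer:
            tok = self.peek()
            self.i += 1
            if tok == "''":
                body = self.nodes("''", in_link)
                self.expect("''")
                out.append(("italic", body))
            elif tok == "[[":
                if in_link:
                    raise ValueError("nested link at token %d" % (self.i - 1))
                target = self.peek()
                if target is None or target in SPECIAL:
                    raise ValueError("missing link target")
                self.i += 1
                self.expect("|")
                label = self.nodes("]]", in_link=True)
                self.expect("]]")
                out.append(("link", target, label))
            elif tok == "]]":
                raise ValueError("unbalanced ]] at token %d" % (self.i - 1))
            else:
                out.append(("text", tok))  # plain text, including a stray "|"
        return out


def parse(wikitext):
    """Return a tree of tuples, or raise ValueError if the input is not well formed."""
    return StrictParser(wikitext).nodes()


if __name__ == "__main__":
    # Hand-tidied version of the example from earlier in the thread.
    tidied = "Here is some ''italics with a ''[[link|''that switches ''off]] the italics."
    print(parse(tidied))
```

The untidied original of that sentence is rejected with a ValueError,
which is the point: the grammar stays simple because the mess is dealt
with before the text gets this far.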
- d.