[Foundation-l] EBNF of Wikitext

Wed Jan 24 18:44:23 UTC 2007

On Mon, 22 Jan 2007 20:31:44 +0000, Mark Clements wrote:

> "Virgil Ierubino" <virgil.ierubino at gmail.com>
> wrote in message
> news:24dce9db0701211957k752c1b69oec2d75ffbb4e5a68 at mail.gmail.com...
>> On 1/22/07, Gerard Meijssen <gerard.meijssen at gmail.com> wrote:
>> > Personally, I'm intrigued. Virgil, could you elaborate on the purpose
>> > of this project? In what ways can it help us (and who exactly is the
>> > 'us' in this case :))?
>>
>> As explained already, an EBNF of Wikitext will allow for expansion of
>> Wikitext easily as it's conforming to a standard, and a more efficient
>> parser, etc. I'm not the expert in WHY it's good though, I just know it is.
>> And I know how to write EBNFs, and I can't work out how to code Wikitext's
>> bullet points in EBNF - if EBNF can't handle Wikitext, then the efficient
>> parsing and expansion won't be possible (as easily).
>>
>> Would very much appreciate help from anyone clued up as to how to proceed
>> if Wikitext can't be EBNFed, or who can show me that it, in fact, can.
>>
>> http://meta.wikimedia.org/wiki/Wikitext_Metasyntax
> 
> I missed the beginning of this thread for some reason.
> 
> Please also see http://www.mediawiki.org/wiki/Markup_spec and its sub-pages
> for some work that has already been done in this area.  It would be useful
> if you could merge your new page on meta into this existing content.
> 
> - Mark Clements (HappyDog)

To try to answer some of what was said earlier in the thread:

No, I don't think Wikitext can be fully EBNFed.  Basically, an EBNF is a
formal specification of a well-defined grammar.  The idea is that the
specification defines the grammar, and parsers are then written to
conform to it.

Currently, Wikitext follows the Perl model: *the* parser defines the
grammar, and any ambiguities are resolved by seeing how that parser deals
with them.

The way to proceed would be to first write EBNF for the unambiguous parts
that are easy to specify, then tweak the parser so that the grammar it
accepts matches the specification.
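
As a rough illustration of that first step, here is a sketch in Python for
the bullet-list case that started this thread.  The EBNF in the comment is
my own simplification and not what the MediaWiki parser actually does, and
the name parse_bullets is made up for the example.  The point is only that
the line-level syntax is easy to pin down if nesting depth is treated as a
count of leading asterisks rather than as nested grammar rules.

```python
import re

# Assumed line-level grammar (a simplification, not MediaWiki's real rules):
#   list   = item , { item } ;
#   item   = marker , { " " } , text , newline ;
#   marker = "*" , { "*" } ;
#   text   = { character - newline } ;
ITEM = re.compile(r"(\*+) *(.*)")


def parse_bullets(source):
    """Return a (depth, text) pair per bullet line; reject any other line."""
    items = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = ITEM.fullmatch(line)
        if match is None:
            raise ValueError(f"line {number} is not a bullet item: {line!r}")
        # Depth is the number of leading asterisks in the marker.
        items.append((len(match.group(1)), match.group(2)))
    return items
```

For example, parse_bullets("* a\n** b") gives [(1, 'a'), (2, 'b')].  Turning
those depths into a nested tree would be a separate step after parsing, which
keeps the specified grammar itself simple.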

The advantage of doing this is that the grammar is then tied to a formal
specification instead of to one parser implementation, so different
implementations could parse exactly the same grammar.  (Currently, if
you had parsers in PHP, Perl, and C, each would handle a few corner cases
according to the peculiarities of its implementation language, so they
would technically implement three slightly different grammars, which
would hopefully be close enough.)