[Foundation-l] New parser in the works - please help

David Gerard dgerard at gmail.com
Sat Nov 17 12:06:39 UTC 2007


Wikitext-l was formed from a recent discussion on wikitech-l about the
need to sanely reimplement the current parser, which is a Horrible
Mess and pretty much impossible to reimplement in another language.

The MediaWiki parser definition is literally "whatever the PHP parser
does." Some of what it does is arguably very wrong, pathological,
magical or just a Stupid Parser Trick. So the list has been formed to
come up with a grammar that defines all the useful parts of the
present parser, and so can be used by anyone to implement a MediaWiki
wikitext parser. This will be useful for other software, for WYSIWYG
editing extensions ... all manner of things.

Some of what some people would think of as a "stupid parser trick" is
in fact important - e.g. L'''uomo'' which renders as L'<i>uomo</i>
(the first apostrophe stays literal; necessary for French and Italian).
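
To make the apostrophe case concrete, here is a rough sketch in Python
of the rule as I understand it: when a line has an odd number of italic
markers and an odd number of bold markers, one bold marker is reread as
a literal apostrophe followed by an italic marker, so the apostrophe
after the L survives and uomo is italicised. This is not the PHP
parser's code. The name render_quotes is made up for illustration, only
runs of exactly two or three apostrophes are handled, and it always
picks the first bold marker, whereas the real parser is choosier about
which one it converts.

```python
import re


def render_quotes(line: str) -> str:
    """Hypothetical, simplified model of apostrophe handling on one line."""
    # Odd indexes of `parts` hold runs of 2+ apostrophes, even indexes hold text.
    parts = re.split(r"('{2,})", line)
    italics = sum(1 for p in parts[1::2] if len(p) == 2)
    bolds = sum(1 for p in parts[1::2] if len(p) == 3)

    if italics % 2 == 1 and bolds % 2 == 1:
        # Unbalanced line: turn the first bold marker into a literal
        # apostrophe plus an italic marker (assumption: first one wins).
        for i in range(1, len(parts), 2):
            if len(parts[i]) == 3:
                parts[i - 1] += "'"
                parts[i] = "''"
                break

    out = []
    in_italic = in_bold = False
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(part)  # plain text
        elif len(part) == 2:
            out.append("</i>" if in_italic else "<i>")
            in_italic = not in_italic
        elif len(part) == 3:
            out.append("</b>" if in_bold else "<b>")
            in_bold = not in_bold
        else:
            out.append(part)  # longer runs are left untouched in this sketch
    return "".join(out)


print(render_quotes("L'''uomo''"))
```

A grammar for the list would have to express this kind of line-wide
rebalancing somehow, which is part of why it is hard to write down.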

So: we need to know which MediaWiki quirks support important
constructs in languages other than English (the language of the
list, and the native language of most of the participants),
and particularly in non-European languages.

This list is unlikely to implement new features, e.g. (an example
brought up by GerardM) the double-apostrophe in Neapolitan. But we
really need to know about present important features that wouldn't be
obvious to an English-speaker going through the present parser code.

- d.

