Lars Aronsson wrote:
Steve Bennett wrote:
Just to check: Is changing MediaWiki syntax absolutely out of the question? Using // and ** for italic and bold respectively would solve that problem, be more consistent and more intuitive, and probably not be excessively difficult to phase in, right?
Many would say that any such discussion is a dead end. However, I think that is a bit narrow-minded. A syntax change on Wikipedia might not be very likely, but you can of course change MediaWiki for use on your own wiki website.
Changing to // and ** doesn't necessarily make the wiki syntax any more BNF parsable than today, does it? You can still write ***** and what's that supposed to mean? If clarity is wanted, the best would probably be <i> and <b>.
Actually, // and ** are at least as clear, and are most definitely parsable by a fixed-lookahead context-free grammar - even an unaugmented LL(k) grammar could probably handle it. <i> and <b> are unambiguous, but ugly and language-dependent. MediaWiki's current behavior "fixes" many of the issues with its ambiguous bold/italics representation with little ad-hoc DWIM-type behavior. It works, but cannot be represented by a CFG and is difficult to extend.
Oh - and ***** would most likely be disambiguated to <b>*</b>. Easy to handle in a CFG with lookahead, and almost certainly what the user meant.
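To make the lookahead point concrete, here is a minimal sketch of a scanner for the proposed // and ** markers. It is not MediaWiki code and not a formal grammar; the function name render_slash_star and the rule "inside bold, a run of three or more asterisks keeps its leading asterisks as literal text" are assumptions made for illustration. It never looks more than three characters ahead.

```python
import html


def render_slash_star(text: str) -> str:
    """Convert hypothetical // (italic) and ** (bold) markers to HTML.

    Illustrative only: uses at most three characters of lookahead.
    """
    out = []
    open_tags = []  # currently open tags, innermost last
    i = 0
    while i < len(text):
        if text.startswith("**", i):
            # Inside bold, a run of 3+ asterisks keeps its leading
            # asterisk as literal text; the last two close the bold.
            if "b" in open_tags and text.startswith("***", i):
                out.append("*")
                i += 1
                continue
            tag = "b"
        elif text.startswith("//", i):
            tag = "i"
        else:
            # Ordinary character: escape it and move on.
            out.append(html.escape(text[i], quote=False))
            i += 1
            continue
        # Toggle the tag: close it if open, otherwise open it.
        if tag in open_tags:
            open_tags.remove(tag)
            out.append(f"</{tag}>")
        else:
            open_tags.append(tag)
            out.append(f"<{tag}>")
        i += 2
    # Close anything left open, innermost first.
    for tag in reversed(open_tags):
        out.append(f"</{tag}>")
    return "".join(out)


print(render_slash_star("*****"))                    # <b>*</b>
print(render_slash_star("**bold** and //italic//"))  # <b>bold</b> and <i>italic</i>
```

This sketch ignores real-world complications: it would treat the // in a URL such as http://example.org as an italic marker, and it does not repair overlapping markers like **a //b** c//. A real grammar would need rules for both.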
The current wiki syntax cannot be described in simple BNF, but it is not impossible to parse. The MediaWiki engine successfully converts it to HTML and, in the reverse direction, users who intend to accomplish a result in bold and italics are able to convert this intention into wiki syntax.
Slight disagreement on terms: The current syntax is convertible to HTML - it is not parseable. At the least, it is not currently parsed... No internal representation is generated, and the system just makes something on the order of 50 regex passes per page to convert it into HTML.
Thus, it is also possible to write a program that converts the current Wikipedia dump into using <i> and <b> rather than apostrophes, and then back again to traditional wiki syntax. Since <i> and <b> are already supported, you could make it a policy on your own wiki whether apostrophes should be deprecated. To enforce such a policy, every stored article can be converted. Such a policy is not very likely on Wikipedia, though.
Possible - though difficult. I'd actually welcome someone creating a program to convert MediaWiki's syntax from apostrophes to <i> and <b>, as that could technically provide a more formal specification of how the MediaWiki parser handles apostrophes. However, at the moment, the only such program we have is MediaWiki itself.
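For illustration, a naive converter for the easy cases might look like the sketch below. It is two regex passes in the same spirit as the regex-based conversion described above, but it is not MediaWiki's algorithm: the function name apostrophes_to_tags is made up here, and the code assumes markers are balanced, stay on one line, and never form ambiguous runs.

```python
import re

# Bold must be handled before italic, because ''' contains ''.
BOLD = re.compile(r"'''(.+?)'''")
ITALIC = re.compile(r"''(.+?)''")


def apostrophes_to_tags(text: str) -> str:
    """Rewrite simple, balanced apostrophe markup as <b> and <i> tags.

    Assumption: this does NOT reproduce MediaWiki's handling of
    ambiguous or unbalanced apostrophe runs.
    """
    text = BOLD.sub(r"<b>\1</b>", text)      # pass 1: '''bold'''
    text = ITALIC.sub(r"<i>\1</i>", text)    # pass 2: ''italic''
    return text


print(apostrophes_to_tags("'''bold''' and ''italic''"))
# <b>bold</b> and <i>italic</i>
```

The hard part is exactly what this leaves out. A five-apostrophe run such as '''''both''''' comes out of these two passes with misnested tags, and a stray apostrophe next to a marker (as in a French l' followed by italics) is not considered at all. Getting those cases to match the existing behavior is what would turn such a program into a usable specification.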
- Eric