Steve Bennett wrote:
Just to check: Is changing MediaWiki syntax absolutely out of the question? Using // and ** for italic and bold respectively would solve that problem, be more consistent and more intuitive, and probably not be excessively difficult to phase in, right?
Many would say that any such discussion is a dead end, but I think that is a bit narrow-minded. A syntax change on Wikipedia is not very likely, but you can of course change MediaWiki for use on your own wiki.
Changing to // and ** doesn't necessarily make the wiki syntax any easier to describe in BNF than it is today, does it? You can still write *****, and what is that supposed to mean? If clarity is what you want, the best choice would probably be <i> and <b>.
The current wiki syntax cannot be described in simple BNF, but it is not impossible to parse. The MediaWiki engine successfully converts it to HTML and, in the reverse direction, users who want a result in bold or italics are able to convert this intention into wiki syntax.
Thus, it is also possible to write a program that converts the current Wikipedia dump to use <i> and <b> rather than apostrophes, and then back again to traditional wiki syntax. Since <i> and <b> are already supported, you could decide as a policy on your own wiki whether apostrophes should be deprecated. To enforce such a policy, every stored article could be converted. Such a policy is not very likely on Wikipedia, though.
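To make the idea of a conversion program concrete, here is a rough Python sketch of the forward direction only (apostrophes to tags). It is not MediaWiki's parser. The function names are mine, and the rules for runs of four apostrophes (one literal apostrophe plus bold) and runs longer than five (the extras are literal) are my understanding of MediaWiki's behaviour, so check them against the real parser. Lines with unbalanced markup are deliberately left untouched.

```python
import re

APOS_RUN = re.compile(r"('{2,})")

# (state, markup length) -> (tags to emit, new state).
# "ib" means <i> was opened before <b>; "bi" is the other order.
TRANSITIONS = {
    ("", 2): ("<i>", "i"),            ("", 3): ("<b>", "b"),
    ("", 5): ("", "both"),
    ("i", 2): ("</i>", ""),           ("i", 3): ("<b>", "ib"),
    ("i", 5): ("</i><b>", "b"),
    ("b", 2): ("<i>", "bi"),          ("b", 3): ("</b>", ""),
    ("b", 5): ("</b><i>", "i"),
    ("ib", 2): ("</b></i><b>", "b"),  ("ib", 3): ("</b>", "i"),
    ("ib", 5): ("</b></i>", ""),
    ("bi", 2): ("</i>", "b"),         ("bi", 3): ("</i></b><i>", "i"),
    ("bi", 5): ("</i></b>", ""),
}
# After ''''' the nesting order is unknown until the next run arrives:
# markup length -> (opening tags, closing tags, new state).
BOTH = {
    2: ("<b><i>", "</i>", "b"),
    3: ("<i><b>", "</b>", "i"),
    5: ("<i><b>", "</b></i>", ""),
}


def normalize(run):
    """Split a run into (literal apostrophes, markup length 2, 3 or 5)."""
    n = len(run)
    if n == 4:          # assumption: ' followed by bold
        return "'", 3
    if n > 5:           # assumption: extras are literal text
        return "'" * (n - 5), 5
    return "", n


def to_html_tags(line):
    """Convert one line of balanced apostrophe markup to <i>/<b> tags."""
    parts = APOS_RUN.split(line)  # text, run, text, run, ..., text
    runs = [normalize(r) for r in parts[1::2]]
    italics = sum(1 for _, n in runs if n in (2, 5))
    bolds = sum(1 for _, n in runs if n in (3, 5))
    if italics % 2 or bolds % 2:
        return line  # unbalanced: leave it exactly as written

    out, buffer, state = [], [], ""
    for idx, part in enumerate(parts):
        target = buffer if state == "both" else out
        if idx % 2 == 0:          # plain text between runs
            target.append(part)
            continue
        literal, n = normalize(part)
        target.append(literal)
        if state == "both":
            open_tags, close_tags, state = BOTH[n]
            out.append(open_tags + "".join(buffer) + close_tags)
            buffer = []
        else:
            tags, state = TRANSITIONS[(state, n)]
            out.append(tags)
    return "".join(out)
```

Running this over every line of an article gives the <i>/<b> form. The reverse direction (tags back to apostrophes) would need its own pass and is not shown.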
What you can do is run some experiments on the existing dump. How many cases are there where ''''' is hard to resolve? Has anybody counted?
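As a starting point for such an experiment, a few lines of Python can tally how often each length of apostrophe run occurs. This is only a sketch: the function name is made up, and it takes the wikitext as a plain string, so extracting article text from the dump is left out.

```python
import re
from collections import Counter


def apostrophe_run_lengths(text):
    """Count runs of two or more apostrophes, keyed by run length."""
    return Counter(len(run) for run in re.findall(r"'{2,}", text))


sample = "''a'' '''b''' '''''c''''' it's"
print(apostrophe_run_lengths(sample))
```

The counts for length 5 and above show how many potentially ambiguous runs there are. Whether each one is actually hard to resolve still has to be judged from its context.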
It is of course possible to write articles with unbalanced apostrophes. If I write '''hey'' it will render as '<i>hey</i>, and that's also how a conversion program should leave it. How many such user mistakes are there in the current dump? Perhaps somebody is already running a robot to find and fix such errors?
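A robot of that kind could begin with something like the following Python sketch, which flags lines where the italic or bold markers do not pair up. The function name is hypothetical, and the treatment of runs of four or more apostrophes reflects my reading of MediaWiki's behaviour, not a checked fact. It assumes that quote markup is resolved line by line.

```python
import re


def unbalanced_lines(text):
    """Yield (line number, line) where italic or bold markers are unpaired."""
    for lineno, line in enumerate(text.splitlines(), 1):
        italics = bolds = 0
        for run in re.findall(r"'{2,}", line):
            n = len(run)
            if n == 2:
                italics += 1
            elif n in (3, 4):   # assumption: '''' is ' plus bold
                bolds += 1
            else:               # 5 or more: italic and bold together
                italics += 1
                bolds += 1
        if italics % 2 or bolds % 2:
            yield lineno, line


for lineno, line in unbalanced_lines("''ok''\n'''hey''\nplain"):
    print(lineno, line)
```

This only finds candidates. Some flagged lines may be intentional, so a human should review them before a robot changes anything.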