Jay R. Ashworth wrote:
On Wed, Nov 14, 2007 at 11:33:39AM -0500, Brion Vibber wrote:
Note also that it *is* a requirement to have sane behavior with this sort of construction:
L'''idée'' <- apostrophe followed by italics
L''''idée''' <- apostrophe followed by bold
That's a *requirement* to continue to properly handle French and Italian text. The current apostrophe pass handler uses, I believe, a lookahead and then goes backwards, which is a fairly sane way of doing this. If EBNF can't handle it, then forget EBNF.
Can someone tell me why bold and italics are considered *part of the spelling of the word* (which seems to be what you're implying here)?
I think you'll find what I'm referring to is the fact that the apostrophe is used in languages such as French and Italian to elide the vowels and space between a definite article and a following substantive.
An example:
L'idée ("The idea")
Further, it's frequent for formatting on the substantive to *not* apply to its preceding article.
This means that if we want "idée" italicized because it's important, or a title, or a ship name, or whatever, we'd format it in HTML something like this:
L'<i>idée</i>
When using the double-apostrophe italics markup (inherited from Ward Cunningham's WikiWikiWeb via UseModWiki), this leads to a need to handle markup that looks like this:
L'''idée''
Perhaps Ward wouldn't have picked this syntax if his original wiki had been in French or Italian, since this case doesn't come up as often in English (though it can, with contractions and possessives). But he did pick it, and we ended up with it, and over the years we've tweaked the implementation to handle these sorts of common cases pretty well.
So, we have a markup, and we have an implementation which uses a fairly straightforward back-facing search to produce behavior which handles the important common cases the way we want.
I strongly doubt that it's impossible to make a specification of that algorithm.
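To make that concrete, here's a rough Python sketch of the kind of per-line pass I'm describing. It is a simplified illustration, not the actual parser code: the function name `render_quotes`, the candidate ranking, and the tag handling are all assumptions made for the example. The idea is to count the apostrophe runs first, and if both bold and italics come out unbalanced, go back and reinterpret one bold marker as a literal apostrophe followed by italics, preferring one that follows a single-letter word.

```python
import re


def render_quotes(line: str) -> str:
    """Convert ''italic'' and '''bold''' markup in one line to HTML (sketch)."""
    # Odd indexes hold apostrophe runs, even indexes hold plain text.
    parts = re.split(r"('{2,})", line)
    num_italic = num_bold = 0
    for i in range(1, len(parts), 2):
        n = len(parts[i])
        if n == 4:
            # Four apostrophes: one literal apostrophe, then bold.
            parts[i - 1] += "'"
            parts[i] = "'''"
        elif n > 5:
            # More than five: the extras are literal apostrophes.
            parts[i - 1] += "'" * (n - 5)
            parts[i] = "'''''"
        n = len(parts[i])
        num_italic += n in (2, 5)
        num_bold += n in (3, 5)

    # Both unbalanced: one bold marker was probably meant to be an
    # apostrophe followed by italics. Look back over the bold markers.
    if num_italic % 2 == 1 and num_bold % 2 == 1:
        def rank(i: int) -> int:
            before = parts[i - 1]
            if before.endswith(" "):
                return 2  # after a space: least likely candidate
            if len(before) == 1 or before[-2:-1] == " ":
                return 0  # after a single-letter word, as in "L'"
            return 1      # after a longer word

        candidates = [i for i in range(1, len(parts), 2) if len(parts[i]) == 3]
        pick = min(candidates, key=lambda i: (rank(i), i))
        parts[pick - 1] += "'"
        parts[pick] = "''"

    out = []
    open_tags = []  # currently open tags, outermost first

    def toggle(tag: str) -> None:
        if tag not in open_tags:
            out.append(f"<{tag}>")
            open_tags.append(tag)
            return
        # Close down to the tag, then reopen anything nested inside it.
        idx = open_tags.index(tag)
        inner = open_tags[idx + 1:]
        for t in reversed(open_tags[idx:]):
            out.append(f"</{t}>")
        del open_tags[idx:]
        for t in inner:
            out.append(f"<{t}>")
            open_tags.append(t)

    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(part)
        elif len(part) == 2:
            toggle("i")
        elif len(part) == 3:
            toggle("b")
        else:
            # Five apostrophes: close open tags innermost first, open the rest.
            order = list(reversed(open_tags))
            order += [t for t in ("i", "b") if t not in open_tags]
            for t in order:
                toggle(t)

    # Close anything still open at the end of the line.
    for t in reversed(open_tags):
        out.append(f"</{t}>")
    return "".join(out)


print(render_quotes("L'''idée''"))
print(render_quotes("L''''idée'''"))
```

With this sketch the first input comes out as L'<i>idée</i> and the second as L'<b>idée</b>, which is the behavior we need for the French and Italian cases. The point is only that the count-then-look-back step can be written down as a short, deterministic procedure.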
-- brion vibber (brion @ wikimedia.org)