Andrew Rodland wrote:
It's nice in the human-usability sense that you can say exactly what you mean,
You can. You *can* use <nowiki> or ' or any other kludge you like. The point is that you shouldn't have to do this when what you're trying to do is something that occurs frequently in half of all articles!
How does "a side-effect of the way regular expressions match text" turn the markup for bold into an apostrophe and the markup for italic?
If you had read my message, you might have noticed that I've been trying to tell you that ''' needn't always be "bold" because it's frequently not what the user meant (especially when there's no matching close-bold).
The second assumption you are making (explicitly, even) is that it is more difficult to implement, when in fact you really just mean that you found it harder, since it is not the way regular expressions normally work (and since you find the behaviour confusing because you don't normally think of French). I didn't find this particularly difficult to do -- neither in the current parser, nor in flexbisonparse.
If you had read my messages, you might have noticed that my reasoning was based neither on anything to do with regexes at all, nor on linguistic prejudice, but on a simple consideration: it is impossible, at the time that the parser sees a ''', to resolve what type of token it is without looking ahead to the end of the line (an unbounded and unknown distance away).
Right, so *this* is what you're on about. I'm afraid there's another false assumption you're making, namely the inefficiency, or otherwise inherent evil, of what you call "look-ahead". You're forgetting that the same applies to [[ and {{ and {| and || and a whole host of other things. *This is not a problem*, neither for the current parser (because it passes over the entire text several times anyway) nor for a proper parser like flexbisonparse (simply because of the way LALR parsers work). If you want me to elaborate on the latter, please feel free to ask and I'll explain.
The alternative is that '' means '', and ''' means '''.
And indeed, they do.
The existing code in doQuotes() simply operates by logically _separating_ the consecutive quotes, so automatic conversion wouldn't be overly taxing, nor time-critical.
... nor required.
And it's still, I think, a violation of expectations.
What you're saying essentially amounts to claiming that this:
On trouve l'''homme'' sur la Terre.
should be rendered as:
On trouve l<b>homme<i> sur la Terre.</i></b>
even though that is clearly not what the user meant.
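To make the difference concrete, here is a rough sketch in Python of the two behaviours for a single line. It is not the actual doQuotes() code: the names render_quotes and rebalance are made up for illustration, only runs of two and three apostrophes are handled, and the rule for picking which ''' to reinterpret (prefer one that follows a single-letter word, otherwise take the first) is a simplifying assumption.

```python
import re

def render_quotes(line, rebalance=True):
    """Render '' as <i> and ''' as <b> within one line (illustrative sketch)."""
    # After the split, odd-numbered entries are apostrophe runs, even ones are text.
    parts = re.split(r"('{2,3})", line)
    marks = range(1, len(parts), 2)
    n_italic = sum(1 for i in marks if len(parts[i]) == 2)
    n_bold = sum(1 for i in marks if len(parts[i]) == 3)

    # Both counts odd means one ''' was really an apostrophe followed by ''.
    # Knowing this requires having seen the whole line first.
    if rebalance and n_italic % 2 == 1 and n_bold % 2 == 1:
        bolds = [i for i in marks if len(parts[i]) == 3]
        # Assumed heuristic: prefer a ''' that follows a single-letter word (l, d, ...).
        after_letter = [i for i in bolds if re.search(r"(?:^|\W)\w$", parts[i - 1])]
        pick = (after_letter or bolds)[0]
        parts[pick - 1] += "'"   # literal apostrophe joins the preceding text
        parts[pick] = "''"       # the rest becomes an italic toggle

    out, stack = [], []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(part)
            continue
        tag = "i" if len(part) == 2 else "b"
        if tag in stack:
            # Close down to and including this tag, then reopen what was inside it.
            reopen = []
            while True:
                top = stack.pop()
                out.append("</%s>" % top)
                if top == tag:
                    break
                reopen.append(top)
            for t in reversed(reopen):
                stack.append(t)
                out.append("<%s>" % t)
        else:
            stack.append(tag)
            out.append("<%s>" % tag)
    # Anything still open is closed at the end of the line.
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out)

line = "On trouve l'''homme'' sur la Terre."
print(render_quotes(line, rebalance=False))  # the literal reading
print(render_quotes(line))                   # the reading the user meant
```

With rebalance=False this prints the rendering shown above; with the default it prints On trouve l'<i>homme</i> sur la Terre. The extra work is one count over the line before any tags are emitted.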
Timwi