Andrew Rodland wrote:
It's nice in the human-usability sense that you can say exactly what
you mean,
You can. You *can* use <nowiki> or ' or any other kludge you like.
The point is that you shouldn't have to do this when what you're trying
to do is something that occurs frequently in half of all articles!
How does "a side-effect of the way regular expressions match text"
turn the markup for bold into an apostrophe and the markup for
italic?
If you had read my message, you might have noticed that I've been trying
to tell you that ''' needn't always be "bold" because it's frequently
not what the user meant (especially when there's no matching close-bold).
The second assumption you are making (explicitly, even) is that it
is more difficult to implement, when in fact you really just mean
that you found it harder because it is not the way regular
expressions normally work (and because you find the behaviour
confusing because you don't normally think of French). I didn't
find this particularly difficult to do -- neither in the current
parser, nor in flexbisonparse.
If you had read my messages, you might have noticed that my reasoning
was based neither on anything to do with regexes at all, nor on
linguistic prejudice, but on a simple consideration. It is
impossible, at the time that the parser sees a ''', to resolve what
type of token it is, without looking ahead to the end of the line (an
unbounded and unknown distance away).
Right, so *this* is what you're on about. I'm afraid there's another
false assumption you're making, namely inefficiency or otherwise
inherent evil of what you call "look-ahead". You're forgetting that the
same applies to [[ and {{ and {| and || and a whole host of other
things. *This is not a problem.* Neither for the current parser (because
it passes the entire text several times anyway) nor for a proper parser
like flexbisonparse (simply because of the way LALR parsers work). If
you want me to elaborate on the latter, please feel free to ask and I'll
explain.
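
To make the look-ahead point concrete, here is a minimal Python sketch. It is
not the code of the current parser or of flexbisonparse; the names
quote_runs and triple_is_bold are hypothetical, and the "even number of '''
runs" rule is an assumption made only for illustration. It collects the
apostrophe runs on a line first and decides what ''' means afterwards, which
costs nothing extra once the line has been read anyway:

```python
import re

def quote_runs(line):
    """Return the lengths of all runs of two or more apostrophes in the line."""
    return [len(m.group()) for m in re.finditer(r"'{2,}", line)]

def triple_is_bold(line):
    """Assumed rule for this sketch: ''' means bold only when the line
    contains an even, non-zero number of ''' runs (every opener has a closer)."""
    triples = sum(1 for n in quote_runs(line) if n == 3)
    return triples > 0 and triples % 2 == 0
```

The decision is made per line, so the "unbounded" distance is in practice the
length of one line of wikitext.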
The alternative is that '' means '',
and ''' means '''.
And indeed, they do.
The existing code in doQuotes() simply operates by logically
_separating_ the consecutive quotes, so automatic conversion wouldn't
be overly taxing, nor time-critical.
... nor required.
And it's still, I think, a violation of expectations.
What you're saying essentially amounts to saying that this:
On trouve l'''homme'' sur la Terre.
should be rendered as:
On trouve l<b>homme<i> sur la Terre.</i></b>
even though that is clearly not what the user meant.
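
For comparison, the behaviour I am arguing for can be sketched in a few lines
of Python. This is a simplified illustration, not the actual doQuotes() code:
render_quotes is a hypothetical name, picking the *first* ''' to reinterpret
is an assumption, and runs of four or more apostrophes and unclosed tags are
deliberately left unhandled. When a line has an odd number of '' and an odd
number of ''', one ''' is read as a literal apostrophe followed by '':

```python
import re

def render_quotes(line):
    """Render '' and ''' on one line, treating an unmatched ''' as ' plus ''."""
    # Splitting on a capturing group keeps the apostrophe runs at odd indices.
    parts = re.split(r"('{2,})", line)
    runs = range(1, len(parts), 2)
    n_italic = sum(1 for i in runs if len(parts[i]) == 2)
    n_bold = sum(1 for i in runs if len(parts[i]) == 3)

    if n_italic % 2 == 1 and n_bold % 2 == 1:
        # Both are unbalanced: reinterpret the first ''' (assumed choice).
        for i in runs:
            if len(parts[i]) == 3:
                parts[i - 1] += "'"   # literal apostrophe joins the preceding text
                parts[i] = "''"       # the rest becomes italic markup
                break

    out, italic, bold = [], False, False
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(part)                      # ordinary text
        elif len(part) == 2:
            out.append("</i>" if italic else "<i>")
            italic = not italic
        elif len(part) == 3:
            out.append("</b>" if bold else "<b>")
            bold = not bold
        else:
            out.append(part)                      # longer runs: untouched in this sketch
    return "".join(out)

print(render_quotes("On trouve l'''homme'' sur la Terre."))
```

With this rule the French example comes out with an apostrophe and an italic
word, and balanced markup such as '''bold''' is unaffected.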
Timwi