On Thu, Nov 15, 2007 at 03:56:21PM +1100, Steve Bennett wrote:
On 11/15/07, Jay R. Ashworth <jra(a)baylink.com> wrote:
L'''idée'' <- apostrophe followed by italics
L''''idée''' <- apostrophe followed by bold
That's a *requirement* to continue to properly handle French and Italian
text. The current apostrophe pass handler uses, I believe, a lookahead and
then goes backwards, which is a fairly sane way of doing this. If EBNF
can't handle it, then forget EBNF.
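
A rough sketch of the lookahead-then-backtrack idea described above, in Python. This is not the actual MediaWiki handler: the function name render_quotes, the token layout, and the treatment of runs of four or more apostrophes are assumptions made purely for illustration.

```python
import re

def render_quotes(line):
    """Hypothetical helper: resolve apostrophe runs on one line of wikitext."""
    # Splitting on a captured group yields: text, run, text, run, ..., text
    parts = re.split(r"('{2,})", line)
    tokens = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            tokens.append(["text", part])
        elif len(part) == 2:
            tokens.append(["italic", ""])
        elif len(part) == 3:
            tokens.append(["bold", ""])
        else:
            # Simplification (assumption): four or more apostrophes means
            # literal apostrophes followed by a bold marker.
            tokens.append(["text", "'" * (len(part) - 3)])
            tokens.append(["bold", ""])

    # Lookahead: count the markers on the whole line before deciding anything.
    italics = sum(1 for kind, _ in tokens if kind == "italic")
    bolds = sum(1 for kind, _ in tokens if kind == "bold")

    # Go backwards over the decisions: if both counts are odd, one "bold" was
    # probably a literal apostrophe plus italics. Prefer a bold marker that
    # directly follows a single-letter word, as in L'''idée''.
    if italics % 2 == 1 and bolds % 2 == 1:
        for i, (kind, _) in enumerate(tokens):
            if kind == "bold" and re.search(r"(?:^|\W)\w$", tokens[i - 1][1]):
                tokens[i:i + 1] = [["text", "'"], ["italic", ""]]
                break

    # Emit HTML by toggling each tag; unbalanced markers are left unclosed.
    tags = {"italic": "i", "bold": "b"}
    is_open = {"italic": False, "bold": False}
    out = []
    for kind, text in tokens:
        if kind == "text":
            out.append(text)
        else:
            out.append(("</%s>" if is_open[kind] else "<%s>") % tags[kind])
            is_open[kind] = not is_open[kind]
    return "".join(out)

print(render_quotes("L'''idée''"))    # apostrophe followed by italics
print(render_quotes("L''''idée'''"))  # apostrophe followed by bold
```

The point of the sketch is only that the decision about the first run depends on what appears later in the line, which is what makes it awkward to express as a context-free rule.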
Can someone tell me why bold and italics are considered *part of the
spelling of the word* (which seems to be what you're implying here)?
I've never seen that to be the case in any character-based natural
language.
I think it's more that L''''idee'' is a commonly used idiom. It's not part of
the "spelling of the word", whatever that means.
If it's a *requirement* that we be able to produce a certain text
rendering of a word, then it is no longer merely a rendering, it's part
of the spelling of the word -- something without which it's not the
same word.
Similarly, it might be worth investigating exactly what mid-word
multi-apostrophic constructs are used (yes, Jay, like you suggested...). In
French, d'* and l'* are used, and I guess an arbitrary number of others with
diminishing likelihood: qu'*, jusqu'*, s'*, and even m'*, t'*, etc.
I hate the parser's (doQuotes()) current approach of trying to second-guess
what the user wants: we should be dictating the grammar, and either they are
using a rule we specify, or they aren't. I don't really care how complicated
the rules get, but we should be able to define them, stick them on a wall,
and tell people: if you're not using one of these rules, you're going to get
garbage.
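
One way such a "rule on the wall" might look, sketched in Python. Everything here is hypothetical: the prefix list is just the examples named above, the function name mark_elisions is made up, and emitting &#39; for the literal apostrophe is an assumption, not anything the current parser does.

```python
import re

# Assumed, fixed list of elision prefixes (taken from the examples above).
ELISION_PREFIXES = ("jusqu", "qu", "d", "l", "s", "m", "t")

# The rule: a listed prefix at a word boundary, one apostrophe, then exactly
# two or three more. The first apostrophe is literal; the rest is markup.
_ELISION = re.compile(
    r"\b(" + "|".join(ELISION_PREFIXES) + r")'('{2,3})(?!')",
    re.IGNORECASE,
)

def mark_elisions(text):
    """Hypothetical pre-pass: protect the literal apostrophe after an elision."""
    # Replace the literal apostrophe with an entity so that a later
    # bold/italic pass sees only the remaining two or three apostrophes.
    return _ELISION.sub(r"\1&#39;\2", text)

print(mark_elisions("L'''idée''"))      # L&#39;''idée''
print(mark_elisions("L''''idée'''"))    # L&#39;'''idée'''
print(mark_elisions("'''bold'''"))      # unchanged: no elision prefix
```

With a rule like this there is no guessing: text that matches gets the documented treatment, and anything else is passed through untouched for the ordinary bold/italic handling.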
Well, it will be interesting to see how that plays in Peoria, yes. :-)
Cheers,
-- jra
--
Jay R. Ashworth Baylink jra(a)baylink.com
Designer The Things I Think RFC 2100
Ashworth & Associates
http://baylink.pitas.com '87 e24
St Petersburg FL USA
http://photo.imageinc.us +1 727 647 1274