On 11/28/07, Thomas Dalton <thomas.dalton(a)gmail.com> wrote:
Would it help to change the definition of "single letter word" to not
include a single apostrophe? It would just be a small change to one
line of code and can be made straight away (rather than waiting for
the new parser), as far as I can tell, and would fix at least one of
the problems. Something still needs to be done about the feature in the
long term, but in the short term, I see no reason for an apostrophe
counting as a word.
It needs to go slightly further than that. In the following sentence:
Blah four'''' then three''' then '''''five
The four will still be split, even though it's not next to a single
letter word. So the change should be from:
- split the first bold after a single letter word, else after a
multi-letter word, else anywhere
to:
- split the first bold after a single letter word that is not an
apostrophe, else after a multi-letter word that does not end in an
apostrophe, else anywhere that is not immediately after an apostrophe.
Basically, any bold that is immediately after an apostrophe could only
arise from it already being split once, and splitting it *twice* is
definitely not helping anyone.
Hmm, actually I tried this, and it's not really a great improvement. The line:
* Take ''''four''' apostrophes and then throw '''''five unclosed apostrophes at them.
Previously rendered as something like: Take ''<i>four<b>...
Now you get: Take '<b>four' <i>apostrophes...
But at least '''' never renders as two apostrophes, I guess. Here's
the code change:
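if ($x1 == ' ') {
    // Bold follows a space.
    if ($firstspace == -1) $firstspace = $i;
} else if ($x2 == ' ' && $x1 != "'") {
    // Bold follows a single-letter word other than an apostrophe.
    if ($firstsingleletterword == -1) $firstsingleletterword = $i;
} else if ($x1 != "'") {
    // Bold follows a multi-letter word that does not end in an apostrophe.
    if ($firstmultiletterword == -1) $firstmultiletterword = $i;
} else {
    /* Bold already split from '''', don't split again. */
}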
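For anyone who wants to poke at the rule outside the parser, here is a rough standalone sketch. It is not the MediaWiki code: the function name pickBoldToSplit and the token layout are assumptions made for illustration (text at even indexes, apostrophe runs at odd indexes, with '''' already rewritten as a literal apostrophe followed by '''), and it ignores everything else the parser does around this step.

```php
<?php
// Hypothetical helper for illustration only, not part of MediaWiki.
// $tokens alternates text and apostrophe runs: odd indexes hold runs
// such as "''", "'''" or "'''''".
function pickBoldToSplit(array $tokens): int {
    $firstSingle = -1;
    $firstMulti = -1;
    $firstSpace = -1;
    for ($i = 1; $i < count($tokens); $i += 2) {
        if (strlen($tokens[$i]) != 3) {
            continue; // only plain bold runs are candidates
        }
        $prev = $tokens[$i - 1];
        $x1 = substr($prev, -1);    // character just before the run
        $x2 = substr($prev, -2, 1); // character before that one
        if ($x1 === ' ') {
            if ($firstSpace == -1) $firstSpace = $i;
        } elseif ($x2 === ' ' && $x1 !== "'") {
            if ($firstSingle == -1) $firstSingle = $i;
        } elseif ($x1 !== "'") {
            if ($firstMulti == -1) $firstMulti = $i;
        }
        // Otherwise the run follows an apostrophe: it was already split
        // from '''' once, so it is never chosen again.
    }
    // Same priority as the rule above: single letter word, then
    // multi-letter word, then after a space.
    if ($firstSingle > -1) return $firstSingle;
    if ($firstMulti > -1) return $firstMulti;
    return $firstSpace; // -1 when no bold is eligible
}
```

With the tokens for the first example, ["Blah four'", "'''", " then three", "'''", " then ", "'''''", "five"], this returns 3, i.e. the bold after "three" rather than the one after "four'".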
Steve