lcrocker@nupedia.com wrote:
I vote for punctuation cropping, too.
That seems to be a growing consensus. The remaining questions, then, are the exact details: which characters do we want to consider "punctuation", and under what circumstances? My suggestion is this: after parsing URLs the way it does now, if the URL ends with one, and exactly one, of the characters period, comma, question mark, or exclamation mark, then remove that character and assume it punctuates the sentence. Otherwise, leave the URL alone.
I agree, but I'd also add semicolon and colon characters to that list.
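For concreteness, here is a rough Python sketch of the rule with semicolon and colon included. It is only an illustration, not the existing parser: the function name and the constant are made up, and I am reading "one, and exactly one" to mean that the last character is punctuation and the character before it is not.

```python
# Hypothetical helper, not part of the current parser.
# Assumption: the punctuation set is the four proposed characters
# plus semicolon and colon.
TRAILING_PUNCTUATION = ".,?!;:"


def crop_trailing_punctuation(url):
    """Split one trailing punctuation mark off an already-parsed URL.

    Returns a (url, punctuation) pair. The URL is cropped only when it
    ends with exactly one punctuation character; a run of two or more
    (for example "..") is left alone, as is a URL with none.
    """
    if (
        len(url) >= 2
        and url[-1] in TRAILING_PUNCTUATION
        and url[-2] not in TRAILING_PUNCTUATION
    ):
        return url[:-1], url[-1]
    return url, ""


if __name__ == "__main__":
    for candidate in (
        "http://www.nupedia.com/.",
        "http://www.nupedia.com/page;",
        "http://www.nupedia.com/page...",
        "http://www.nupedia.com/page",
    ):
        print(crop_trailing_punctuation(candidate))
```

The cropped character is returned rather than thrown away so the caller can put it back into the sentence after the link.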
Neil