On Mon, Nov 12, 2007 at 10:59:49AM +1100, Steve Bennett wrote:
On 11/12/07, Jay R. Ashworth jra@baylink.com wrote:
Could you each please post your personal favorite hobby-horse counter case which you feel would make parsing these constructs difficult so we can all pick it apart?
IMHO it's utter madness to replace one set of ambiguous, difficult-to-parse formatting markers with a different set of ambiguous, difficult-to-parse formatting markers.
"Jet* is an Australian airline. Jet* was founded in 2000 as a spin-off of Qantas."
The common solution to Tradenames with silly spelling or rendering is to do your best once, and then ignore them for the rest of the article, IME.
"The biggest problems are bold and/or italics, and parsing and/or regular expressions".
"files are most often written to the /etc or /bin directories"
"The sounds /z/ and /s/ are distinct phonemes in English, but allophones in Spanish."
"Quasi-emoticons like *smiles* and *hugs* are used to..."
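For anyone who wants to poke at these, here is a rough Python sketch of how naive singleton rules misfire on two of the examples above. The regexes and function names are made up for illustration; they are not what MediaWiki's parser does.

```python
import re

# Hypothetical singleton rules, for illustration only.
SINGLE_BOLD = re.compile(r"\*(.+?)\*")   # *text* -> bold
SINGLE_ITALIC = re.compile(r"/(.+?)/")   # /text/ -> italic


def bold_single(text):
    """Wrap everything between a pair of single asterisks in <b> tags."""
    return SINGLE_BOLD.sub(r"<b>\1</b>", text)


def italic_single(text):
    """Wrap everything between a pair of single slashes in <i> tags."""
    return SINGLE_ITALIC.sub(r"<i>\1</i>", text)


jet = "Jet* is an Australian airline. Jet* was founded in 2000 as a spin-off of Qantas."
paths = "files are most often written to the /etc or /bin directories"

# The two trade-name asterisks pair up and swallow the sentence between them.
print(bold_single(jet))
# The two path slashes pair up and both disappear from the output.
print(italic_single(paths))
```

In both cases the markers are literal characters in the prose, and the naive rule has no way to know that.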
Ok; you've convinced me: the singletons are too ambiguous. :-)
I think even if you can make rules that can tell them apart, you're adding unnecessary complexity both to the parser and to the user. Compare:
- All text between ** and ** is shown in bold.
- All text between * and * is shown in bold. Except if the first * is at the start of the line, in which case it's a list. Or if the * is in the middle of a word, in which case it's shown literally. Of course, if you actually do want bold in the middle of a word, do X...
Actually there's a flaw here: ** at the start of a line is going to be ambiguous as well. Bugger.
And 2**5 (exponentiation) is a potential problem as well, yes.
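A quick sketch of both problems, assuming a naive "everything between ** and ** is bold" regex and a made-up heuristic for a leading ** (bold only if a closing ** follows on the same line, otherwise a second-level list item). Neither is taken from the real parser.

```python
import re

# Hypothetical rule: all text between ** and ** is bold.
DOUBLE = re.compile(r"\*\*(.+?)\*\*")


def bold_double(text):
    """Wrap everything between a pair of double asterisks in <b> tags."""
    return DOUBLE.sub(r"<b>\1</b>", text)


def classify(line):
    """Guess what a line means, using an assumed closing-marker heuristic."""
    if line.startswith("**"):
        # A closing ** later on the line suggests bold; none suggests a list.
        return "bold" if DOUBLE.match(line) else "list"
    return "text"


print(classify("**Melbourne** is a great city."))  # bold
print(classify("**This is a list."))               # list
# Two exponentiations on one line pair up as a bold span.
print(bold_double("2**5 + 2**3"))
```

The heuristic still guesses wrong for a second-level list item that itself contains a bold phrase, so it narrows the ambiguity rather than removing it.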
Any in-band approach will have this problem; the trick is to choose a token that reduces it to an acceptable level -- where by "acceptable" I mean "causes fewer problems in the Real World than What We Have Now".
:-)
Strangely enough, with my 2 line hack to parser.php, the current text renders exactly correctly:
**Melbourne** is a great city.
**This is a list.
Well, an unadorned second level list item renders poorly just now anyway, right?
So again: is "turning bold and italics off between two alphanumeric characters" a thing which actually *happens*, much?
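To make the question concrete, here is a sketch of one possible rule, under the assumption that a ** only counts as a marker when it does not touch an alphanumeric character on its outer side. The regex is hypothetical and only illustrates the trade-off.

```python
import re

# Hypothetical rule: ** opens bold only if not preceded by an alphanumeric,
# and closes bold only if not followed by one.
BOUNDED = re.compile(r"(?<![A-Za-z0-9])\*\*(.+?)\*\*(?![A-Za-z0-9])")


def bold_bounded(text):
    """Apply bold only where the markers sit at word boundaries."""
    return BOUNDED.sub(r"<b>\1</b>", text)


print(bold_bounded("**Melbourne** is a great city."))  # bold applied
print(bold_bounded("2**5 is 32"))                      # left alone
print(bold_bounded("Foo**Bar**"))                      # left alone
```

Under this rule 2**5 survives untouched, but so does any attempt to switch bold on in the middle of a word, which is exactly the case being asked about.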
Dunno. My parents' company used to be spelt with the first part of the word in bold and the second part in italics, no space in between. There are bound to be a few techie companies spelt like that.
That's not "spelling". That's "rendering", and a policy decision has to be taken as to how much of that is required to be representable.
Cheers, -- jra