On Thu, Aug 17, 2006 at 05:12:22PM +0200, Steve Bennett wrote:
A URL-like thing that was typed without any particular surrounding
syntax (it gets autolinked). Similar lookahead would presumably be
necessary for RFCs, ISBNs, and PMIDs (okay, that's enough to convince
me to agree that they should be ditched :) ). In general, a lookahead
of no more than one character is considered desirable.
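To make the lookahead point concrete, here is a minimal sketch of a scanner deciding whether a free link starts at a given position. The prefix list and the function names (`peek_free_link`, `lookahead_used`) are assumptions made up for illustration, not MediaWiki's actual rules; the point is only that one character of lookahead cannot tell "RFC 2100" apart from "Really".

```python
# Hypothetical prefixes that trigger autolinking; chosen for illustration only.
PREFIXES = ("http://", "https://", "RFC ", "ISBN ", "PMID ")


def peek_free_link(text, i):
    """Return the prefix that starts a free link at position i, or None.

    Word boundaries are deliberately ignored to keep the sketch short.
    """
    for prefix in PREFIXES:
        if text.startswith(prefix, i):
            rest = text[i + len(prefix):]
            if prefix.endswith("//"):
                # A URL needs at least one non-space character after the scheme.
                ok = bool(rest) and not rest[0].isspace()
            else:
                # RFC, ISBN and PMID need a digit right after the keyword.
                ok = rest[:1].isdigit()
            if ok:
                return prefix
    return None


def lookahead_used(prefix):
    """Characters read beyond the first one before the scanner can commit:
    the remaining len(prefix) - 1 prefix characters plus one confirming character."""
    return len(prefix)
```

With these assumptions, `peek_free_link("See RFC 2100", 4)` returns `"RFC "` after looking four characters past the `R`, while `peek_free_link("Really", 0)` returns `None`. That is well beyond the one-character lookahead described above.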
What can I say, I don't like these "freelinks". They just don't seem
clean. Normal text which spontaneously turns into a link without any
special punctuation or anything. Hmm.
Parsers don't have to be single-pass... and ours isn't now.
Is it?
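As a rough illustration of what "not single pass" can mean, the sketch below runs two separate passes over the text: one that escapes HTML, then one that autolinks bare URLs. The function names and the URL pattern are assumptions for this example and are not taken from the MediaWiki parser.

```python
import html
import re

# Simplified pattern for a bare URL; an assumption, not MediaWiki's real rule.
URL = re.compile(r"\bhttps?://[^\s<>]+")


def pass_escape(text):
    # Pass 1: neutralise anything that looks like raw HTML.
    return html.escape(text)


def pass_free_links(text):
    # Pass 2: wrap each bare URL left in the text in an anchor tag.
    return URL.sub(lambda m: '<a href="%s">%s</a>' % (m.group(0), m.group(0)), text)


def render(text):
    # Each pass sees the complete output of the previous one.
    return pass_free_links(pass_escape(text))
```

Because the second pass only ever sees already-escaped text, it does not need to know anything about the first pass's rules, which is the kind of separation the Parser/Sanitizer question is about.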
He *seems* to be saying that you'd have to make special rules for each
allowed HTML tag, and presumably each allowed attribute and property
thereof, and maybe even every combination of them (!). Would there be
any advantage in leaving those out of the grammar and keeping Parser
and Sanitizer separate as they are now?
I don't get why we even allow HTML tags, other than convenience. It's
not like the final output of the encyclopaedia is guaranteed to bear
any resemblance to a web page...
For instance, why do we support <b>? We have '''... It's just not
clean. (I dare someone to reply that ''' is semantic markup...heh.)
It is; I believe it renders as <strong>, not <b>.
Cheers,
-- jra
--
Jay R. Ashworth jra(a)baylink.com
Designer Baylink RFC 2100
Ashworth & Associates The Things I Think '87 e24
St Petersburg FL USA
http://baylink.pitas.com +1 727 647 1274
The Internet: We paved paradise, and put up a snarking lot.