-----BEGIN PGP SIGNED MESSAGE-----
On Thu, Jul 31, 2008 at 3:48 PM, Brion Vibber wrote:
HTML 4 defines the contents of those elements as CDATA in the DTD, just
like <br> and <img> are defined as having no content so there's no
ambiguity when they're being interpreted by an HTML parser.
XHTML doesn't provide for that sort of declaration, since XML requires
you to be able to parse a document without having a DTD ahead of time.
For compatibility of documents between both HTML and XHTML parsers,
XHTML 1.0 recommends using linked resources if possible -- so there's no
worry about how to escape contents -- or else using explicit
<![CDATA[...]]> sections in your <script> and <style> elements.
So in fact, a compliant HTML parser *would* parse the contents of
<script> or <style> incorrectly, if it contained entities that were
expected to be decoded?
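
For illustration, the difference can be reproduced with Python's standard library. This is only a sketch: the one-line script is a made-up sample, and html.parser stands in for a browser's text/html parser because it follows the same rule of treating <script> contents as raw CDATA, while xml.etree.ElementTree stands in for an XHTML (XML) parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ScriptText(HTMLParser):
    """Collects the raw text found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


# Hypothetical sample markup, chosen only to show the entity handling.
markup = '<script>document.write("&amp;");</script>'

# HTML rules: script content is CDATA, so the entity is left undecoded.
html_parser = ScriptText()
html_parser.feed(markup)
html_parser.close()
html_text = "".join(html_parser.chunks)

# XML rules: the same bytes are ordinary character data, so the entity
# is decoded before the script ever sees it.
xml_text = ET.fromstring(markup).text

print("HTML parser sees:", html_text)
print("XML parser sees: ", xml_text)
```

The two parsers hand different source text to the script engine, which is why escaping that is right for one mode is wrong for the other.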
Right... I just did a quick test to confirm. This file:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
will display "&amp;" when served as text/html and "&" when
In that case my fix is wrong, and we should
write up a Sanitizer::escapeCdata() and use that here (and elsewhere).
Icky... but perhaps no good way around it I guess. :)
The tricky bit is that for HTML mode you want to wrap the "<![CDATA["
and "]]>" bits in comments (/* blah */) so they don't interfere with the
JS or CSS code.
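
A minimal sketch of that wrapping, in Python rather than MediaWiki's PHP. The function name escape_cdata is hypothetical and this is not the Sanitizer::escapeCdata() proposed above; it also assumes the code contains no "]]>" sequence and simply refuses such input, since splitting the CDATA section would corrupt the code in HTML mode.

```python
import xml.etree.ElementTree as ET


def escape_cdata(code: str) -> str:
    """Wrap JS or CSS so it survives both HTML and XML parsing.

    The CDATA markers sit inside /* ... */ comments, so an HTML parser
    (which passes the markers through as raw text) hands the script
    engine nothing but two harmless comments around the code.
    """
    if "]]>" in code:
        # Assumption of this sketch: reject rather than try to split.
        raise ValueError('code must not contain "]]>"')
    return "/*<![CDATA[*/\n" + code + "\n/*]]>*/"


# Hypothetical sample with characters that plain XML text would reject.
js = 'if (a < b && c) { alert("&"); }'
wrapped = escape_cdata(js)

# XML mode: the parser consumes the CDATA markers, leaving empty
# comments around the untouched code.
xml_text = ET.fromstring("<script>" + wrapped + "</script>").text

print(wrapped)
print(xml_text)
```

In HTML mode the script engine sees the wrapped string unchanged, with the markers hidden in comments. In XML mode it sees the code between two empty /**/ comments. The same comment syntax is valid in both JS and CSS.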
Does that necessarily generally work? Bah, this shouldn't be so
- -- brion
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
-----END PGP SIGNATURE-----