(Richard, you may be interested in the wikitech-l development list; see http://www.wikipedia.org/mailman/listinfo/wikitech-l )
On Mon, 2002-12-30 at 11:02, Richard Grevers wrote:
> My apologies if this has been discussed before, but I just noticed that Wikipedia pages carry a doctype declaration of
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
> Since this does not include the URL to a full DTD, the curse of doctype sniffing sees most browsers render the page in "Quirks" or "Bugwards compatible" mode rather than "strict" mode.
Yes, this is intentional.
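For anyone unfamiliar with the sniffing rule in question, here is a minimal sketch of it in Python. It covers only the HTML 4.01 doctypes discussed in this thread; the function name `sniff_mode` and the mode labels are illustrative assumptions, and real browsers apply longer tables that differ from one browser to the next.

```python
import re

# Matches: <!DOCTYPE HTML PUBLIC "public id" ["system id (DTD URL)"]>
_DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>',
    re.IGNORECASE,
)

def sniff_mode(doctype):
    """Hypothetical helper: guess the rendering mode for an HTML 4.01 doctype."""
    match = _DOCTYPE_RE.match(doctype.strip())
    if match is None:
        return "unknown"  # not covered by this sketch
    public_id = match.group(1).lower()
    system_id = match.group(2)  # None when the DTD URL is omitted
    if public_id.startswith("-//w3c//dtd html 4.01 transitional//"):
        # Transitional without a DTD URL is what triggers quirks mode;
        # with the URL, browsers use standards mode or a close variant.
        return "quirks" if system_id is None else "standards"
    if public_id.startswith("-//w3c//dtd html 4.01//"):
        return "standards"  # HTML 4.01 Strict
    return "unknown"

current = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'
print(sniff_mode(current))
```

Under these assumptions, the declaration we currently send is classified as quirks mode, while the same declaration with the loose.dtd URL appended is not.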
> In the interests of making Wikipedia accessible, I think it would be better to force strict mode - then editors would be more likely to notice markup errors which affect the rendering on browsers which do not correct for bad markup.
We don't use the strict DTD (or, better yet, XHTML) because our hacked-together wikicode->HTML parser currently can't guarantee that it will generate well-formed output (particularly if there is raw HTML in the page, which is munged a bit but not always correctly). If well-behaved web browsers reject the page or massively break page rendering due to a minor error, it's going to be mighty difficult for editors using them to click 'edit' and try to work around the problem!
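To illustrate what "well-formed" would demand of the parser's output, here is a rough sketch that runs a fragment through Python's standard XML parser. This is not how the wiki software checks anything today; the helper name `well_formedness_error` and the wrapping `<div>` are assumptions made for the example. Note that named HTML entities such as `&nbsp;` would also be reported as errors by this check, since a plain XML parser does not know them.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(fragment):
    """Hypothetical helper: return a parse error message, or None if the
    fragment is well-formed XML once wrapped in a single root element."""
    try:
        # Wrap the fragment so several top-level elements are allowed.
        ET.fromstring("<div>" + fragment + "</div>")
    except ET.ParseError as err:
        return str(err)
    return None

# Properly nested and closed markup passes.
print(well_formedness_error("<p>Some <b>bold</b> text</p>"))

# Overlapping tags and omitted end tags are both rejected.
print(well_formedness_error("<b><i>oops</b></i>"))
print(well_formedness_error("<table><tr><td>cell</table>"))
```

A browser that handled XHTML as strictly as this parser does would stop at the first such error, which is exactly the situation that makes it hard to reach the edit link.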
(On a similar note, nested HTML tables without closing </td> and </tr> tags, while acceptable under HTML 4 standards, break Netscape 4.x, even to the point of *crashing* it outright. This cropped up a while ago on some of the Canadian province articles; the person who noticed the problem couldn't edit the pages to track it down, since the browser crashed before an edit link was made available. Really strict DTD parsing isn't quite as bad -- it should at least tell you the problem! -- but makes it hard to continue browsing.)
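Pages with that kind of table markup could be flagged ahead of time with something like the sketch below, which counts <tr> and <td> start tags against their end tags using Python's standard `html.parser` module. The class and function names are made up for the example, and a simple count like this only shows that end tags were omitted somewhere in the page, not where.

```python
from html.parser import HTMLParser

class CellTagCounter(HTMLParser):
    """Hypothetical helper: tracks start tags minus end tags for tr and td."""

    def __init__(self):
        super().__init__()
        self.balance = {"tr": 0, "td": 0}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in self.balance:
            self.balance[tag] += 1

    def handle_endtag(self, tag):
        if tag in self.balance:
            self.balance[tag] -= 1

def omitted_end_tags(html):
    """Return how many tr/td start tags have no matching end tag."""
    counter = CellTagCounter()
    counter.feed(html)
    counter.close()
    return {tag: n for tag, n in counter.balance.items() if n > 0}

# Valid HTML 4, but the kind of markup that upsets Netscape 4.x.
print(omitted_end_tags("<table><tr><td>a<td>b<tr><td>c</table>"))
```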
Volunteers for a new XHTML-safe parser are welcome...
-- brion vibber (brion @ pobox.com)