Hello,
Last I heard, many Wikimedia and other MediaWiki installations relied
on HTML Tidy to some degree to cope with bad markup, such as templates
may generate. There is also some interest in using "HTML5" features in
MediaWiki wikis, but combining the two is difficult as Tidy will choke
on elements it does not know (since it does not know how to parse them)
and configuring them with the new-*-tags options is troublesome.
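
To illustrate why the new-*-tags route gets tedious, here is a minimal sketch that builds a command line for an unpatched Tidy, declaring a handful of "HTML5" elements by hand. The helper name `tidy_command` and the particular element lists are assumptions made for illustration only; the `--new-blocklevel-tags` and `--new-inline-tags` options and the `-quiet` flag are the existing Tidy ones. Every new element has to be listed and classified this way before Tidy will accept it.

```python
# Hypothetical helper, for illustration only: assembles the arguments an
# unpatched HTML Tidy needs before it accepts a few "HTML5" elements.
# The element lists are examples, not a complete or recommended set.
NEW_TAGS = {
    "new-blocklevel-tags": ["section", "article", "nav", "aside"],
    "new-inline-tags": ["mark", "time"],
}


def tidy_command(new_tags=NEW_TAGS, executable="tidy"):
    """Return the tidy argument list declaring the given new elements."""
    cmd = [executable, "-quiet"]
    for option, tags in new_tags.items():
        if tags:
            # Tidy takes each list as comma-separated tag names.
            cmd += ["--" + option, ",".join(tags)]
    return cmd


if __name__ == "__main__":
    print(" ".join(tidy_command()))
```

The list has to be kept up to date per installation, and each element must be put in the right category, which is the maintenance burden the patches are meant to remove.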
I made some patches, posted at
http://lists.w3.org/Archives/Public/www-archive/2011Nov/0006.html
to address that problem (and only that problem really, Tidy
is not going to be a proper "HTML5" parser that parses documents exactly
as a browser would) so that Tidy keeps working with old markup as usual,
but doesn't choke on new elements, doesn't complain about new attributes
and doesn't "break" conforming documents. Where new markup is malformed,
Tidy is unlikely to repair it as sensibly as it does for old markup, and
it may in fact make it even more malformed or otherwise break it. I
do not care about that at this point, and there is nobody else
contributing code to the project of late.
I am interested in polishing my patch and including this in the official
repository, but I need people I can blame instead of myself when it does
not work right, err, I mean, testers. I haven't heard from anyone with a
MediaWiki background that they actually had some problem with Tidy plus
"HTML5", but if that is just because deployment is slow, this is likely
the best place to look for takers.
Discussion is taking place on tidy-develop(a)lists.sourceforge.net but you
can also use the bug tracker on sourceforge or mail me directly, ideally
sending a copy to www-archive(a)w3.org so there is a publicly archived
copy I can refer others to if they would like to help out. Feedback on
things other than the patch or "HTML5" support in general is welcome too
of course.
regards,
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/