On 11-08-09 05:03 PM, John Elliot wrote:
On 10/08/2011 9:49 AM, Daniel Friesen wrote:
WikiText is loose, so instead of raising errors, if the
parser doesn't like something you input, it's not going to pass that
through raw and let an HTML validator say it's wrong; it's going to
decide it doesn't like it and treat it as plaintext.
Well, the validation feature that I added to my web-site helped me catch
a bug for you.
If you are outputting WikiText that includes the HTML-like <h1>, <h2>,
etc., tags, then make sure you're not outputting them in the context of
table content, because that is invalid. In order to turn such WikiText
into compliant HTML, the <h1> WikiText should be converted to a <span
class="h1"> HTML element, and so forth. The various skins should be
updated to do something sensible with the h* classes.
<h#> tags are not invalid inside of table contents. <td>'s contents are
flow content, and <h#> tags are flow content.
<h#> tags are, however, invalid inside of <th> tags, whose content model
is phrasing content. In that context, though, the correct thing would not
necessarily be to turn the h# into a span, but to fold it into the header
that's already there.
Which may or may not be what the user wants. Both of those changes can
break a user's site styles.
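To make the two options concrete, here is a rough Python sketch of what each
would do to the inner HTML of a single <th> cell. This is not MediaWiki code:
heading_to_span and fold_heading are hypothetical names, and the regex only
handles simple, non-nested <h#> tags (attributes on the <h#> are dropped).

```python
import re

# Hypothetical helpers for illustration only; not part of MediaWiki.
# Matches a simple <hN ...>text</hN> pair, N in 1..6, same level on both ends.
HEADING = re.compile(r"<h([1-6])(\s[^>]*)?>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)


def heading_to_span(cell_html: str) -> str:
    """Option 1: turn <hN>text</hN> into <span class="hN">text</span>."""
    return HEADING.sub(
        lambda m: f'<span class="h{m.group(1)}">{m.group(3)}</span>', cell_html
    )


def fold_heading(cell_html: str) -> str:
    """Option 2: drop the <hN> wrapper so the text folds into the <th>."""
    return HEADING.sub(lambda m: m.group(3), cell_html)
```

Either way the markup a site's CSS was written against changes: a selector
such as "th h2" stops matching after both transformations.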
Would you like to argue for a $wgStricterParsing bool that will
sacrifice parser output consistency for things like folding == headers
into parent th's (perhaps turn into a span if they explicitly use a <h#>
instead of ==), and other things we haven't been able to do to the
parser for compat reasons?
I'll let you know if my HTML validator helps me to easily catch any
other bugs like this for you.
We've already established that MediaWiki is broken because it's
outputting empty <ul> elements, so maybe you can have a look at fixing
that up too.
That was an HTML4/XHTML1 rule that's been removed. An empty <ul></ul> is
valid HTML5.
Wikipedia is currently set to output an XHTML DOCTYPE and well-formed XML
only because some bots still screen-scrape content; their developers were
given a second chance to fix them to use the API before HTML5 is turned on
permanently.
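For anyone who wants to see how often the empty <ul> case from above actually
shows up in their output, here is a minimal Python sketch using only the
standard library's html.parser. EmptyListFinder and count_empty_ul are
hypothetical names, and the check is an approximation: it counts <ul>
elements with no <li> start tag inside them, which is what the HTML4/XHTML1
DTDs reject and HTML5 allows.

```python
from html.parser import HTMLParser


class EmptyListFinder(HTMLParser):
    """Counts <ul> elements that contain no <li> at all (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one <li> counter per currently open <ul>
        self.empty = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li" and self.stack:
            self.stack[-1] += 1  # credit the innermost open <ul>
        elif tag == "ul":
            self.stack.append(0)

    def handle_endtag(self, tag):
        if tag == "ul" and self.stack:
            if self.stack.pop() == 0:
                self.empty += 1


def count_empty_ul(html: str) -> int:
    finder = EmptyListFinder()
    finder.feed(html)
    finder.close()
    return finder.empty
```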
Thanks.
John.
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]