HTML 5 is the up-and-coming version of the HTML standard, which supports all sorts of new and exciting features. For those who don't know about it, here's some background:
Wikipedia article: http://en.wikipedia.org/wiki/HTML_5
Summary of major differences from HTML 4: http://www.w3.org/TR/html5-diff/
Full specification: http://dev.w3.org/html5/spec/Overview.html
It's clear at this point that HTML 5 will be the next version of HTML. It was obvious for a long time that XHTML was going nowhere, but now it's official: the XHTML working group has been disbanded and work on all non-HTML 5 variants of HTML has ceased. (Source: http://www.w3.org/2009/06/xhtml-faq.html) MediaWiki will have to switch to HTML 5 sooner or later. It's a great standard, and I think we would do well to be early on the curve here and help spark interest in and support for it.
HTML 5 is designed to be backward-compatible with legacy content, both on the authoring side and (especially) the implementation side. Well-written XHTML 1.0 should theoretically need only minor modifications to validate as HTML 5, and indeed this appears to be the case in practice. All that's required to get a typical page in Monobook validating as HTML 5 in the W3C's experimental validator is (*if* we disregard user-added markup):
* Change the doctype to "<!doctype html>".
* Delete '<meta http-equiv="Content-Style-Type" content="text/css" />'. Which is a really stupid element anyway. :P
* Delete name attributes from all <a> elements. They've been redundant to id for eternity, and every browser in the universe supports id; we can finally move these to the headers themselves.
* Remove comments from inside <script> tags with a src attribute. I already did this in r52828, since they're pointless anyway.
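The doctype, <meta>, and <script> changes in the list above are mechanical enough to sketch as string rewrites. What follows is only a rough JavaScript illustration: the name toHtml5Sketch is hypothetical, and patching markup with regexes is an assumption made for brevity, not how MediaWiki would actually make the change (that would happen where the markup is generated).

```javascript
// Crude sketch: apply three of the listed changes to a page's markup.
// Assumption: regex rewriting is good enough for a demonstration only.
function toHtml5Sketch(xhtml) {
  return xhtml
    // 1. Replace whatever doctype is present with the HTML 5 one.
    .replace(/<!DOCTYPE[^>]*>/i, '<!doctype html>')
    // 2. Drop the Content-Style-Type <meta> element.
    .replace(/<meta\s+http-equiv="Content-Style-Type"[^>]*>\s*/i, '')
    // 3. Empty out <script src=...> elements that still contain a comment.
    .replace(
      /(<script\b[^>]*\bsrc=[^>]*>)\s*<!--[\s\S]*?-->\s*(<\/script>)/gi,
      '$1$2'
    );
}
```

Moving name attributes from <a> elements onto the headers is left out, since it depends on the surrounding markup and does not reduce to a one-line rewrite.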
(The W3C validator is at http://validator.w3.org/. You can override the doctype and set it to interpret Wikipedia URLs as HTML 5 under "More Options".)
Note that HTML 5 does follow in the "strict" vein of XHTML. Presentational elements and attributes such as font, border, cellpadding, etc. are all invalid in HTML 5. (Implementations must support them, but conforming documents must not use them. <b> and <i> remain valid.) There's very little of this stuff left in the HTML that ships with the software. We can remove this incrementally as it's reported.
For user-added content, I think it's fair to just treat it as GIGO -- if they submit invalid content that can't be easily converted to a valid form, it will be output as-is. Users can already submit invalid content in cases where we can't easily fix it, e.g., duplicate id's. If we switch to HTML 5, the W3C validator will begin outputting errors on this presentational stuff, which should hopefully encourage users to reduce it over time, at least in high-profile places like the front page or infoboxes.
So converting to HTML 5 would be trivial. However, in addition to lending our support to good standards, there are several modest practical benefits that would accrue from the switch. I include here only things that are possible in valid HTML 5 documents, but which would not validate as XHTML 1 (so excluding stuff like localStorage); and which are usable right now (so excluding stuff like <nav>, <input type=color>, etc.):
* HTML 5 permits omission of a lot of the cruft that XHTML requires. It permits leaving off ending tags in most cases where that's unambiguous, and leaving off some required tags entirely (such as <html>, <head>, and <body> if they have no attributes). The "/>" ending is no longer required. Superfluous attributes like type=text/javascript on <script> are no longer needed (unless you want to use <script type=application/x-python> or something, of course!). Quotes may be omitted from attributes in almost all cases. The doctype is shorter and easy to remember, and there is no xmlns attribute. For an example of how compact valid HTML 5 can be, look at the source of http://aryeh.name/. I once did a crude test and found we could cut 5% or so off the length of our HTML by doing this -- *after* gzipping. Not only does this make our code smaller, it will also make it easier to read.
* We could support <video>/<audio> on conformant user agents without the use of JavaScript. There's no reason we should need JS for Firefox 3.5, Chrome 3, etc.
* We can use data-* attributes to store custom data for scripts. This came up in the case of the HTML diff work: the author of that stuck some data for scripts in custom attributes, which caused XHTML 1 validation to fail.
* We can use HTML 5 form attributes. These will enhance the experience of users of appropriate browsers, and do nothing for others. At least Opera 9.6x already supports almost all HTML 5 form attributes. (Source: http://www.opera.com/docs/specs/presto211/forms/) We could, for instance, give required fields the "required" attribute, which will cause the browser to prevent the form submission and notify the user if they aren't filled in, without needing either JavaScript or a server-side check. The "pattern" attribute even allows requiring that the input match a regex, and this is also supported by Opera 9.6x. See http://dev.w3.org/html5/spec/Overview.html#common-input-element-attributes.
* There are a couple of parser tests that currently fail because of misnested tags. If we altered the parser to no longer output any </p> tags (which HTML 5 permits), these tests would immediately pass. It doesn't look like anyone's going to fix them otherwise.
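To make the "required" and "pattern" behaviour from the list above concrete, here is a minimal JavaScript sketch of the checks a supporting browser applies before submitting a form. The function name and the shape of the field object are hypothetical. The two rules it encodes are that an empty value fails only when the field is required, and that a pattern has to match the whole value rather than a substring.

```javascript
// Sketch of the "required" and "pattern" checks; names are illustrative.
function fieldIsValid(field, value) {
  if (value === '') {
    // An empty value fails only when the field is required; the pattern
    // is not consulted for empty values.
    return !field.required;
  }
  if (field.pattern !== undefined) {
    // The pattern must match the entire value, so anchor it at both ends.
    return new RegExp('^(?:' + field.pattern + ')$').test(value);
  }
  return true;
}
```

Something along these lines could also serve as a script fallback for browsers that ignore the attributes.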
These are only a few of the things that have immediate concrete benefit. There are probably more I couldn't find immediately (HTML 5 is a huge spec), and of course in the long term there's an incredible amount that would be invaluable to us.
I propose the following migration plan:
1) Fix the doctype, Content-Style-Type, and name attributes. We can then officially claim we're shipping HTML 5! :) (Albeit maybe invalid in some cases.) Also remove any unnecessary attributes and elements, without breaking XML well-formedness. Begin using HTML 5 form attributes and any other useful features. Poke the Cortado people about letting <video> work without JavaScript.
2) Once this goes live, if no problems arise, try causing an XML well-formedness error. For instance, remove the quote marks around one attribute of an element that's included in every page. I suggest this as a separate step because I suspect there are some bot operators who are doing screen-scraping using XML libraries, so it would be a good idea to assess how feasible it is at the present time to stop being well-formed. In the long run, of course, those bot operators should switch to using the API. If we receive enough complaints once this goes live, we can revert it and continue to ship HTML 5 that's also well-formed XML, for the time being.
3) If XML well-formedness is not a problem, get rid of all unneeded closing tags, quotation marks, self-closing "/>" constructs, etc. Create an Html class like Xml, which will generate elements in the nice compact form that HTML 5 permits, and phase out use of Xml in favor of Html. (Xml has long since ceased to be purely about XML anyway.)
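As a rough idea of what the compact output in step 3 could look like, here is a small JavaScript sketch of an element generator. It is not the proposed Html class itself (that would live in MediaWiki's PHP code); the function names and the short list of void elements are assumptions for illustration. It drops quotes only where HTML 5 allows an unquoted value, and never emits "/>" or a closing tag for void elements.

```javascript
// Hypothetical sketch of a compact HTML 5 element generator.
// A subset of the void elements, which never take a closing tag.
const VOID_ELEMENTS = new Set(['br', 'hr', 'img', 'input', 'link', 'meta']);

function attribute(name, value) {
  if (value === true) return ' ' + name; // boolean attribute, e.g. required
  const text = String(value).replace(/&/g, '&amp;');
  // Unquoted values must be non-empty and free of whitespace, quotes,
  // "=", "<", ">" and backticks.
  if (/^[^\s"'=<>`]+$/.test(text)) return ' ' + name + '=' + text;
  return ' ' + name + '="' + text.replace(/"/g, '&quot;') + '"';
}

// `contents` is inserted as-is; the caller is assumed to have escaped it.
function element(name, attrs = {}, contents = '') {
  const open = '<' + name +
    Object.entries(attrs).map(([k, v]) => attribute(k, v)).join('') + '>';
  // Void elements get neither "/>" nor a closing tag.
  return VOID_ELEMENTS.has(name) ? open : open + contents + '</' + name + '>';
}
```

For instance, element('input', { type: 'text', name: 'wpName', required: true }) yields <input type=text name=wpName required>, while an attribute value containing a space keeps its quotes.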
So, what are people's thoughts?
On Tue, Jul 7, 2009 at 1:54 AM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote: [snip]
- We could support <video>/<audio> on conformant user agents without the use of JavaScript. There's no reason we should need JS for Firefox 3.5, Chrome 3, etc.
Of course, that could be done without switching the rest of the site to HTML5...
Although I'm not sure that giving the actual video tags is desirable. It's a tradeoff:
Work for those users when JS is enabled and correctly handle saving the full page including the videos vs take more traffic from clients doing range requests to generate the poster image, and potentially traffic from clients which decide to go ahead and fetch the whole video regardless of the user asking for it.
There is also still a bug in FF3.5 where the built-in video controls do not work when JS is fully disabled. (Because the controls are written in JS themselves.)
(To be clear to other people reading this: the MediaWiki OggHandler extension already uses HTML5 and works fine with Firefox 3.5, etc. But this only works if you have JavaScript enabled. The site could instead embed the video elements directly, and only use JS to substitute fallbacks for the video tag when it detects that the video tag can't be used.)
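The detection step described above (use the video tag when possible, substitute a fallback otherwise) can be sketched in JavaScript as follows. The helper name is hypothetical and this is not OggHandler's actual code; it only relies on the standard canPlayType method of media elements.

```javascript
// Hypothetical helper: decide whether native <video> can be used for Ogg.
// `video` is anything shaped like an HTMLVideoElement, or undefined where
// the element is not supported.
function canUseNativeOgg(video) {
  if (!video || typeof video.canPlayType !== 'function') {
    return false; // no <video> support at all
  }
  const answer = video.canPlayType('video/ogg; codecs="theora, vorbis"');
  // canPlayType answers "probably", "maybe" or the empty string; early
  // drafts used "no" for the negative case, so treat that as unsupported.
  return answer !== '' && answer !== 'no';
}
```

In a browser this would be called as canUseNativeOgg(document.createElement('video')), and only a false result would trigger the swap to a fallback such as Cortado.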
On Tue, Jul 7, 2009 at 2:07 AM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
Of course, that could be done without switching the rest of the site to HTML5...
Well, not without breaking XHTML validity, and in that case what's the point of sticking with XHTML? I don't think we'll be serving an HTML 5 doctype for pages with <video>, and an XHTML 1 doctype otherwise.
take more traffic from clients doing range requests to generate the poster image
You should be able to use the "poster" attribute. Firefox doesn't support this until 3.6, though https://bugzilla.mozilla.org/show_bug.cgi?id=449156. (I *think* WebKit already supports it, at least on trunk, based on some quick searches of their Bugzilla; not sure since when.) For Firefox 3.5, you could add the poster image with JavaScript, which is still strictly better than the current situation. Probably it would be possible to provide the poster image using some simple CSS hacks, too.
I don't know what range requests you're referring to?
and potentially traffic from clients which decide to go ahead and fetch the whole video regardless of the user asking for it.
This is supposed to be controlled by the autobuffer attribute. Are you aware that any user agents will buffer the whole video even if this attribute isn't present? It would probably be sensible to have autobuffer set on the file page itself, but not when it's included in articles. (I'd hope browsers wouldn't request the *whole* thing even if autobuffer is set, only enough to be able to reliably play to the end if the user hits play.)
Of course, part of the whole point of the <video> element is that content authors are giving up control over implementation details to browser authors. If Firefox buffers video too aggressively . . . file a bug with them and fix it for everyone! :) A bare <video> tag will mean the user doesn't have to interact with site-specific custom apps -- which is the idea. Let them use the same native browser interface that they'll (hopefully) become accustomed to from all the other sites using <video>.
There is also still a bug in FF3.5 where the built-in video controls do not work when JS is fully disabled. (Because the controls are written in JS themselves.)
Well, that's unfortunate, but it hopefully won't affect 3.6. Surely it won't affect Chrome, or Safari with the appropriate codec installed. Even at worst, it won't be noticeably inferior to the current situation for these users, and there are other benefits (no need to load Cortado at all, no custom interface).
On Tue, Jul 7, 2009 at 2:31 AM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote: [snip]
You should be able to use the "poster" attribute. Firefox doesn't support this until 3.6, though https://bugzilla.mozilla.org/show_bug.cgi?id=449156. (I *think* WebKit already supports it, at least on trunk, based on some quick searches of their Bugzilla; not sure since when.) For Firefox 3.5, you could add the poster image with JavaScript, which is still strictly better than the current situation. Probably it would be possible to provide the poster image using some simple CSS hacks, too.
What do you think we're doing now? A jpeg 'poster' is displayed. When the user clicks the poster is replaced by the appropriate playback mechanism.
This is supposed to be controlled by the autobuffer attribute. Are you aware that any user agents will buffer the whole video even if this attribute isn't present?
Firefox betas, for example. :) There is a QT-based browser with <video/> support that just fetches as soon as possible (I can't think of the name at the moment). I expect we'll see more of it in the future. Probably not enough to matter.
I said it needed to be weighed, not that the weighing would come out any particular way. I'm a fan of using Video natively. The fact that it makes save-page work the way it should is really great.
It would probably be sensible to have autobuffer set on the file page itself, but not when it's included in articles.
That's a good idea, and another compelling argument for using the video tag directly rather than a last-minute substitution.
[snip]
installed. Even at worst, it won't be noticeably inferior to the current situation for these users, and there are other benefits (no need to load Cortado at all, no custom interface).
I'm not sure how you think it currently works, but there is currently zero need to load Cortado for HTML5-supporting browsers.
Gregory Maxwell wrote:
On Tue, Jul 7, 2009 at 2:31 AM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote: [snip]
installed. Even at worst, it won't be noticeably inferior to the current situation for these users, and there are other benefits (no need to load Cortado at all, no custom interface).
I'm not sure how you think it currently works but there is currently zero need to load cortado for HTML5 supporting browsers.
Unless they don't have Ogg support. :)
*cough Safari cough*
But if they do, yes; our JS won't bother bringing up the Java applet if it's got native support available.
-- brion
On Tue, Jul 7, 2009 at 4:23 PM, Brion Vibber <brion@wikimedia.org> wrote:
Unless they don't have Ogg support. :)
*cough Safari cough*
But if they do, yes; our JS won't bother bringing up the Java applet if it's got native support available.
It would be a four or five line patch to make OggHandler nag Safari 3/4 users to install XiphQT and give them the link to a download page. The spot for the nag is already stubbed out in the code. Just say the word.
I think if the playback system is java in ~any browser~ we should ~softly~ "inform" people to get a browser with native support if they want a high quality video playback experience.
The Cortado applet is awesome ... but the startup time of the Java VM is painful compared to other user experiences with video, not to mention seeking, buffering, and general interface responsiveness in comparison to native support.
--michael
Gregory Maxwell wrote:
On Tue, Jul 7, 2009 at 4:23 PM, Brion Vibber <brion@wikimedia.org> wrote:
Unless they don't have Ogg support. :)
*cough Safari cough*
But if they do, yes; our JS won't bother bringing up the Java applet if it's got native support available.
It would be a four or five line patch to make OggHandler nag Safari 3/4 users to install XiphQT and give them the link to a download page. The spot for the nag is already stubbed out in the code. Just say the word.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Michael Dale wrote:
I think if the playback system is java in ~any browser~ we should ~softly~ "inform" people to get a browser with native support if they want a high quality video playback experience.
The cortado applet is awesome ... but startup time of the java vm is painful compared to other user experiences with video.. not to mention seeking, buffering, and general interface responsiveness in comparison to the native support.
*nod*
We don't want to annoy users, but subtle nudges to a better experience can be good. :)
(It'd be good to avoid the "This site best viewed in Netscape Gold" sort of browser fanboy wars of the '90s, though. ;)
-- brion
2009/7/7 Brion Vibber brion@wikimedia.org:
Michael Dale wrote:
I think if the playback system is java in ~any browser~ we should ~softly~ "inform" people to get a browser with native support if they want a high quality video playback experience. The cortado applet is awesome ... but startup time of the java vm is painful compared to other user experiences with video.. not to mention seeking, buffering, and general interface responsiveness in comparison to the native support.
*nod* We don't want to annoy users, but subtle nudges to a better experience can be good. :) (It'd be good to avoid the "This site best viewed in Netscape Gold" sort of browser fanboy wars of the '90s, though. ;)
I know we can't do it, but I do have subtle dreams of "Sorry, this video won't display in Safari because Apple refuse to. If you don't want to use a better browser, here's Apple's phone number."
- d.
On Wed, 2009-07-08 at 00:59 +0100, David Gerard wrote:
2009/7/7 Brion Vibber brion@wikimedia.org:
We don't want to annoy users, but subtle nudges to a better experience can be good. :) (It'd be good to avoid the "This site best viewed in Netscape Gold" sort of browser fanboy wars of the '90s, though. ;)
I know we can't do it, but I do have subtle dreams of "Sorry, this video won't display in Safari because Apple refuse to. If you don't want to use a better browser, here's Apple's phone number."
- d.
I'm not sure I'd call that "subtle"... "Effective" maybe :D
A little "Did you know Firefox 3.5 can show this to you better, get it here" notice would be great. Take care to avoid browser fanboy wars, yes, but it'd be nice to nudge gently. Probably text-only and dismissible would be fine.
-Mike
On Wed, Jul 8, 2009 at 10:43 AM, Mike.lifeguard <mikelifeguard@fastmail.fm> wrote:
On Wed, 2009-07-08 at 00:59 +0100, David Gerard wrote:
2009/7/7 Brion Vibber brion@wikimedia.org:
We don't want to annoy users, but subtle nudges to a better experience can be good. :) (It'd be good to avoid the "This site best viewed in Netscape Gold" sort of browser fanboy wars of the '90s, though. ;)
I know we can't do it, but I do have subtle dreams of "Sorry, this video won't display in Safari because Apple refuse to. If you don't want to use a better browser, here's Apple's phone number."
- d.
I'm not sure I'd call that "subtle"... "Effective" maybe :D
A little "Did you know Firefox 3.5 can show this to you better, get it here" notice would be great. Take care to avoid browser fanboy wars, yes, but it'd be nice to nudge gently. Probably text-only and dismissible would be fine.
-Mike
To further this point, I think it's crucial that we don't phrase it as being about the browser as a whole, but keep it related to what you're currently doing.
"Did you know that ABC can give you a richer experience when editing?"
"Did you know that XYZ can show you this video faster, and with less skipping?"
That sort of thing is bound to be much more effective than "Switch to Firefox/Chrome/etc today!", dismissible or not.
-Chad
The current language is "For best video playback experience we recommend _Firefox 3.5_" ... but I am open to adjustments.
--michael
Chad wrote:
To further this point, I think it's very crucial that we don't phrase it as being about the browser as a whole, but maybe keep it related to what you're currently doing.
"Did you know that ABC can give you a richer experience when editing?" "Did you know that XYZ can show you this video faster, and with less skipping?"
That sort of thing is bound to be much more effective than "Switch to Firefox/Chrome/etc today!" dismissable or not.
-Chad
On Wed, Jul 8, 2009 at 2:23 PM, Michael Dale <mdale@wikimedia.org> wrote:
The current language is "For best video playback experience we recommend _Firefox 3.5_" ... but I am open to adjustments.
I'd drop the word experience. It's superfluous marketing speak.
So the notice chain I'm planning on adding to the simple <video/> compatibility JS is something like this:
If the user is using Safari 4 on a desktop system and doesn't have XiphQT:
* Advise the user to install XiphQT (note, there should be a good installer available soon)
The rationale being that if they are known to use Safari now they probably will in the future, so it's better to get them to install XiphQT than to hope they'll continue using another browser.
If the user is using any of a list of platforms known to support Firefox:
* Advise them to use Firefox 3.5
Otherwise say nothing. It would be silly at this time to be advising users of some non-firefox-supporting mobile device that firefox 3.5 provides the best experience. ;)
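The notice chain above could be sketched as a single decision function. Everything named here is an illustrative assumption: the field names on the client object, the platform list, and the returned notice identifiers are not taken from any existing script. The early return for clients that already play Ogg natively is also an assumption, in line with not nagging users who have native-grade playback.

```javascript
// Hypothetical sketch of the notice chain; all names are illustrative.
// Assumed list of platforms on which Firefox 3.5 is available.
const FIREFOX_PLATFORMS = ['windows', 'mac', 'linux'];

function chooseNotice(client) {
  // Assumption: say nothing when playback is already native.
  if (client.hasNativeOgg) return null;
  // Desktop Safari 4 without XiphQT: recommend the QuickTime component.
  if (client.browser === 'safari4' && client.isDesktop && !client.hasXiphQT) {
    return 'install-xiphqt';
  }
  // Any platform known to support Firefox: recommend Firefox 3.5.
  if (FIREFOX_PLATFORMS.includes(client.platform)) {
    return 'use-firefox-3.5';
  }
  // Otherwise (e.g. a mobile device without Firefox) say nothing.
  return null;
}
```

Checking Safari first keeps existing Safari users on the XiphQT path rather than pointing them at a browser switch.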
It should also be noted that a simple patch for OggHandler to output <video> and use the mv_embed library is in the works; see: https://bugzilla.wikimedia.org/show_bug.cgi?id=18869
You can see it in action in a few places, like http://metavid.org/wiki/File:FolgersCoffe_512kb.1496.ogv
Also note my ~soft~ push for native support if you don't already have native support (per our short discussion earlier in this thread). If you say "don't show again", it sets a cookie and won't show it again.
I would be happy to randomly link to other browsers that support html5 video tag with ogg as they ship with that functionality.
I don't really have an Apple machine handy to test the quality of user experience in OS X Safari with xiph-qt. But if that is on par with Firefox native support we should probably link to the component install instructions for Safari users.
--michael
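The "don't show again" cookie mentioned in this message could work roughly as in the JavaScript sketch below. The cookie name, its one-year lifetime, and both function names are assumptions; the thread does not say what mv_embed actually uses.

```javascript
// Hypothetical cookie name; the real one is not given in the thread.
const NAG_COOKIE = 'hideNativeVideoNag';

// Returns true when the nag should be shown, given a string in the
// format of document.cookie ("a=1; b=2").
function shouldShowNag(cookieString) {
  return !cookieString
    .split(';')
    .map((pair) => pair.trim())
    .includes(NAG_COOKIE + '=1');
}

// Builds the value to assign to document.cookie when the user picks
// "don't show again". The one-year max-age is an assumption.
function dismissNagCookie() {
  return NAG_COOKIE + '=1; path=/; max-age=31536000';
}
```

In a page this would be used as shouldShowNag(document.cookie) before rendering the notice, and document.cookie = dismissNagCookie() on dismissal.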
Gregory Maxwell wrote:
On Tue, Jul 7, 2009 at 1:54 AM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote: [snip]
- We could support <video>/<audio> on conformant user agents without the use of JavaScript. There's no reason we should need JS for Firefox 3.5, Chrome 3, etc.
Of course, that could be done without switching the rest of the site to HTML5...
Although I'm not sure that giving the actual video tags is desirable. It's a tradeoff:
Work for those users when JS is enabled and correctly handle saving the full page including the videos vs take more traffic from clients doing range requests to generate the poster image, and potentially traffic from clients which decide to go ahead and fetch the whole video regardless of the user asking for it.
There is also still a bug in FF3.5 where the built-in video controls do not work when JS is fully disabled. (Because the controls are written in JS themselves.)
(To be clear to other people reading this the mediawiki ogghandler extension already uses HTML5 and works fine with Firefox 3.5, etc. But this only works if you have javascript enabled. The site could instead embed the video elements directly, and only use JS to substitute the video tag for fallbacks when it detects that the video tag can't be used)
On Tue, Jul 7, 2009 at 7:53 PM, Michael Dale <mdale@wikimedia.org> wrote: [snip]
I don't really have an Apple machine handy to test the quality of user experience in OS X Safari with xiph-qt. But if that is on par with Firefox native support we should probably link to the component install instructions for Safari users.
I believe it's quite good. Believe is the best I can offer never having personally tested it. I did work with a safari user sending them specific test cases designed to torture it hard (and some XiphQT bugs were fixed in the process) and at this point it sounds pretty good.
What I have not stressed is any of the JS API. I know it seeks, I have no clue how well, etc.
There is also an apple webkit developer who is friendly and helpful at getting things fixed whom we work with if we do encounter bugs... but more testing is really needed.
Safari users wanted.
As far as the 'soft push' ... I'm generally not a big fan of one-shot completely dismissible nags: Too often I click past something only to realize shortly thereafter that I really should have clicked on it. I'd prefer something that did a significant (alert-level) nag *once* but perpetually included a polite "Upgrade your Video" button below (above?) the fallback video window.
There is only a short period of time remaining where a singular browser recommendation can be done fairly and neutrally. Chrome and Opera will ship production versions and then there will be options. Choices are bad for usability.
On Wed, Jul 8, 2009 at 3:46 AM, Gregory Maxwell gmaxwell@gmail.com wrote:
There is only a short period of time remaining where a singular browser recommendation can be done fairly and neutrally. Chrome and Opera will ship production versions and then there will be options. Choices are bad for usability.
We should not recommend Chrome: as good as it is, it has serious privacy problems. Opera is not Open Source, so I think we'd best stay with Firefox, even if Chrome/Opera begin to support the video tag.
Marco
On Wed, Jul 8, 2009 at 4:43 PM, Marco Schuster <marco@harddisk.is-a-geek.org> wrote:
We should not recommend Chrome - as good as it is, but it has serious privacy problems.
Out of curiosity, why do we need to "recommend" a browser at all, and why do we think anyone will listen to our "recommendation"? People use the browser they use. If the site they want to go to doesn't work in their browser, they'll either not go there, or possibly try another one. They're certainly not going to change browsers just because the site told them to.
Personally, I use Chrome, FF and IE. And the main reason for switching is just to have different sets of cookies. Occasionally a site doesn't like Chrome, so I switch. But it's not like I'm going to take a "your experience would be better in <browser>" statement seriously.
Steve
We need to inform people that the quality of experience can be substantially improved if they use a browser that supports free formats. Wikimedia only distributes content in free formats because if you have to pay for a license to view, edit or publish ~free content~ then the content is not really ~free~.
We have requested that Apple and IE support free formats but they have chosen not to. Therefore we are in a position where we have to recommend a browser that does have a high quality user experience in supporting the formats. We are still making every effort to display the formats in IE & Safari using Java or plugins, but we should inform people they can have an improved experience on par with proprietary solutions if they are using a different browser.
--michael
Steve Bennett wrote:
On Wed, Jul 8, 2009 at 4:43 PM, Marco Schuster <marco@harddisk.is-a-geek.org> wrote:
We should not recommend Chrome - as good as it is, but it has serious privacy problems.
Out of curiosity, why do we need to "recommend" a browser at all, and why do we think anyone will listen to our "recommendation"? People use the browser they use. If the site they want to go to doesn't work in their browser, they'll either not go there, or possibly try another one. They're certainly not going to change browsers just because the site told them to.
Personally, I use Chrome, FF and IE. And the main reason for switching is just to have different sets of cookies. Occasionally a site doesn't like Chrome, so I switch. But it's not like I'm going to take a "your experience would be better in <browser>" statement seriously.
Steve
On Wed, Jul 8, 2009 at 2:15 PM, Michael Dale <mdale@wikimedia.org> wrote:
We need to inform people that the quality of experience can be substantially improved if they use a browser that supports free formats. Wikimedia only distributes content in free formats because if you have to pay for a license to view, edit or publish ~free content~ then the content is not really ~free~.
We have requested that Apple and IE support free formats but they have chosen not to. Therefore we are in a position where we have to recommend a browser that does have a high quality user experience in supporting the formats. We are still making every effort to display the formats in IE & Safari using Java or plugins, but we should inform people they can have an improved experience on par with proprietary solutions if they are using a different browser.
People should at least be informed that it matters.
I doubt most people are like Steve, running and trying different browsers and discovering on their own which works best.
2009/7/8 Michael Dale mdale@wikimedia.org:
We have requested that Apple and IE support free formats but they have chosen not to. Therefore we are in a position where we have to recommend a browser that does have a high quality user experience in supporting the formats. We are still making every effort to display the formats in IE & Safari using Java or plugins, but we should inform people they can have an improved experience on par with proprietary solutions if they are using a different browser.
A method that doesn't say "your browser sucks" but shows it:
"You are using Safari without XiphQT. Install the Ogg codecs _here_ for a greatly improved Wikimedia experience."
"You are using Internet Explorer. Install the Ogg codecs _here_ for a greatly improved Wikimedia experience."
The first linking to XiphQT, the second to Ogg DirectShow.
- d.
David Gerard wrote:
A method that doesn't say "your browser sucks" but shows it:
"You are using Safari without XiphQT. Install the Ogg codecs _here_ for a greatly improved Wikimedia experience."
"You are using Internet Explorer. Install the Ogg codecs _here_ for a greatly improved Wikimedia experience."
The first linking to XiphQT, the second to Ogg DirectShow.
Internet Explorer does not support the video tag, installing Ogg DirectShow filters does not help there.
j
2009/7/8 j@v2v.cc:
David Gerard wrote:
"You are using Internet Explorer. Install the Ogg codecs _here_ for a greatly improved Wikimedia experience."
Internet Explorer does not support the video tag, installing Ogg DirectShow filters does not help there.
Yes, I realised this just after sending my email :-)
I presume, though, there's some way of playing videos in IE. Is there a way to tell if the Ogg filters are installed?
- d.
On Wed, Jul 8, 2009 at 3:06 PM, David Gerard <dgerard@gmail.com> wrote:
2009/7/8 j@v2v.cc:
David Gerard wrote:
"You are using Internet Explorer. Install the Ogg codecs _here_ for a greatly improved Wikimedia experience."
Internet Explorer does not support the video tag, installing Ogg DirectShow filters does not help there.
Yes, I realised this just after sending my email :-)
I presume, though, there's some way of playing videos in IE. Is there a way to tell if the Ogg filters are installed?
Java or via the VLC plugin
At least Safari + XiphQT has the benefit of working as well as Firefox 3.5 does. The same is not true for Java or VLC. (The VLC plugin is reported to cause many browser crashes; Java is slow to launch and somewhat CPU-hungry.)
I've suggested making the same installer for XiphQT for win32 also install the XiphDS plugins, which would make things easier on users. But XiphDS does not help with in-browser playback today.
Since, at the moment, firefox is the only non-beta browser with direct support I don't see why plugging Firefox would be controversial. It's a matter of fact that it works best with Firefox 3.5 or Safari+XiphQT. Later when there are several options things will be a little more complicated. Certainly I don't think any recommendation should be made when the user already has native-grade playback.
2009/7/8 Gregory Maxwell gmaxwell@gmail.com:
Since, at the moment, firefox is the only non-beta browser with direct support I don't see why plugging Firefox would be controversial. It's a matter of fact that it works best with Firefox 3.5 or Safari+XiphQT. Later when there are several options things will be a little more complicated. Certainly I don't think any recommendation should be made when the user already has native-grade playback.
Yeah. Recommend XiphQT or Firefox to Safari users, Firefox to all others.
iPhone users, get 'em to call Apple? Seriously, what wording do we use to politely get across to iPhone users that this is 100% Apple's express decision to break Ogg video?
Probably an idea to run this past foundation-l for sanity checking - this would be perceived by the outside world as Wikimedia getting involved (which it most assuredly is).
- d.
You use QuickTime plus the Xiph QuickTime components (i.e. the codec).
-Mike
Okay, first thoughts:
On Mon, Jul 6, 2009 at 11:54 PM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
It's clear at this point that HTML 5 will be the next version of HTML. It was obvious for a long time that XHTML was going nowhere, but now it's official: the XHTML working group has been disbanded and work on all non-HTML 5 variants of HTML has ceased. (Source: http://www.w3.org/2009/06/xhtml-faq.html)
That page clearly says that there will be an XHTML 5. XHTML is not going away.
- Delete '<meta http-equiv="Content-Style-Type" content="text/css" />'. Which is a really stupid element anyway. :P
- Delete name attributes from all <a> elements. They've been redundant to id for eternity, and every browser in the universe supports id; we can finally move these to the headers themselves.
- Remove comments from inside <script> tags with a src attribute. I already did this in r52828, since they're pointless anyway.
Good ideas.
- We can use HTML 5 form attributes. These will enhance the experience of users of appropriate browsers, and do nothing for others. At least Opera 9.6x already supports almost all HTML 5 form attributes. (Source: http://www.opera.com/docs/specs/presto211/forms/) We could, for instance, give required fields the "required" attribute, which will cause the browser to prevent the form submission and notify the user if they aren't filled in, without needing either JavaScript or a server-side check.
What's to prevent a malicious user from manually posting an invalid submission? If there are no server-side checks, will the servers crash?
- Once this goes live, if no problems arise, try causing an XML well-formedness error. For instance, remove the quote marks around one attribute of an element that's included in every page. I suggest this as a separate step because I suspect there are some bot operators who are doing screen-scraping using XML libraries, so it would be a good idea to assess how feasible it is at the present time to stop being well-formed. In the long run, of course, those bot operators should switch to using the API. If we receive enough complaints once this goes live, we can revert it and continue to ship HTML 5 that's also well-formed XML, for the time being.
Why be cruel to our bot operators? XHTML is simpler and more consistent than tag soup HTML, and it's a lot easier to find a good XML parser than a good HTML parser.
So, while I see some benefit to switching to HTML 5, I'd prefer to use XHTML 5 instead.
On 07/07/2009, at 7:37 AM, Remember the dot wrote:
Okay, first thoughts:
On Mon, Jul 6, 2009 at 11:54 PM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
It's clear at this point that HTML 5 will be the next version of HTML. It was obvious for a long time that XHTML was going nowhere, but now it's official: the XHTML working group has been disbanded and work on all non-HTML 5 variants of HTML has ceased. (Source: http://www.w3.org/2009/06/xhtml-faq.html)
That page clearly says that there will be an XHTML 5. XHTML is not going away.
- We can use HTML 5 form attributes. These will enhance the experience of users of appropriate browsers, and do nothing for others. At least Opera 9.6x already supports almost all HTML 5 form attributes. (Source: http://www.opera.com/docs/specs/presto211/forms/) We could, for instance, give required fields the "required" attribute, which will cause the browser to prevent the form submission and notify the user if they aren't filled in, without needing either JavaScript or a server-side check.
What's to prevent a malicious user from manually posting an invalid submission? If there are no server-side checks, will the servers crash?
... Or from using a browser that doesn't support them. We're obviously not going to be removing server-side checks in favour of client-side checks, that's stupid. We're adding client-side checks to enhance usability.
- Once this goes live, if no problems arise, try causing an XML well-formedness error. For instance, remove the quote marks around one attribute of an element that's included in every page. I suggest this as a separate step because I suspect there are some bot operators who are doing screen-scraping using XML libraries, so it would be a good idea to assess how feasible it is at the present time to stop being well-formed. In the long run, of course, those bot operators should switch to using the API. If we receive enough complaints once this goes live, we can revert it and continue to ship HTML 5 that's also well-formed XML, for the time being.
Why be cruel to our bot operators? XHTML is simpler and more consistent than tag soup HTML, and it's a lot easier to find a good XML parser than a good HTML parser.
They should be using the API.
So, while I see some benefit to switching to HTML 5, I'd prefer to use XHTML 5 instead.
You've given one benefit of XHTML5, which is negated by the fact that we provide the API for a consistent machine-readable interface, and the benefits to HTML5 that Aryeh has outlined. What other advantages are there?
-- Andrew Garrett Contract Developer, Wikimedia Foundation agarrett@wikimedia.org http://werdn.us
On Tue, Jul 7, 2009 at 2:37 AM, Remember the dot <rememberthedot@gmail.com> wrote:
That page clearly says that there will be an XHTML 5. XHTML is not going away.
By "XHTML" I meant "the family of standards including XHTML 1.0, 1.1, 2.0, etc.". XHTML 5 is identical to HTML 5 except with a different serialization. Practically speaking, however, it looks like no one will use XHTML 5 either, because it's impossible to deploy on the current web. (See below.) As far as I can tell, it was thrown in as a sop to XML fans, on the basis that it cost very little to add it to the spec (given the definition in terms of DOM plus serializations), without any expectation that anyone will use it in practice.
What's to prevent a malicious user from manually posting an invalid submission? If there are no server-side checks, will the servers crash?
Obviously there will be server-side checks as well! This will just serve to inform the user immediately that they're missing a required field, without having to wait for the server or use JavaScript.
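To make the division of labour concrete, here is a minimal sketch of the kind of server-side check that stays in place regardless of what the browser does with "required". It is written in Python purely for illustration; the field names and the helper are hypothetical and are not MediaWiki code.

```python
from typing import List, Mapping

# Hypothetical field names, chosen only for illustration.
REQUIRED_FIELDS = ("username", "password")


def missing_required_fields(form: Mapping[str, str]) -> List[str]:
    """Return the required fields that are absent or blank in a submitted form."""
    return [name for name in REQUIRED_FIELDS if not form.get(name, "").strip()]


# A hand-crafted POST never passes through the browser's "required" check,
# so the server inspects the submission itself before acting on it.
print(missing_required_fields({"username": "alice", "password": ""}))
```

The client-side attribute only saves the user a round trip; the server still decides whether the submission is acceptable.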
Why be cruel to our bot operators? XHTML is simpler and more consistent than tag soup HTML, and it's a lot easier to find a good XML parser than a good HTML parser.
Because it will make the markup easier to read and write for humans, and smaller. Things like leaving off superfluous closing elements do not make for "tag soup". One of the great features of HTML 5 is that it very carefully defines the text/html parsing model in painstaking backward-compatible detail. For example, the description of unquoted attributes is as follows:
"The attribute name, followed by zero or more space characters, followed by a single U+003D EQUALS SIGN character, followed by zero or more space characters, followed by the attribute value, which, in addition to the requirements given above for attribute values, must not contain any literal space characters, any U+0022 QUOTATION MARK (") characters, U+0027 APOSTROPHE (') characters, U+003D EQUALS SIGN (=) characters, U+003C LESS-THAN SIGN (<) characters, or U+003E GREATER-THAN SIGN (>) characters, and must not be the empty string.
"If an attribute using the unquoted attribute syntax is to be followed by another attribute or by one of the optional U+002F SOLIDUS (/) characters allowed in step 6 of the start tag syntax above, then there must be a space character separating the two." http://dev.w3.org/html5/spec/Overview.html#attributes
Given that browsers need to implement all these complicated algorithms anyway, there's no reason to prohibit the use of convenient shortcuts for authors. They're absolutely well-defined, and even if they're more complicated for machines to parse, they're easier for humans to use than the theoretically simpler XML rules.
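As a rough illustration of how mechanical the quoted rule is, here is a small Python sketch that decides whether an attribute value may be written without quotes. It assumes "space characters" means space, tab, LF, FF and CR, and the helper names are made up for this example; it also escapes "&" in every case as a conservative choice of its own.

```python
# Characters the quoted rule forbids in an unquoted attribute value.
# Assumption: "space characters" covers space, tab, LF, FF and CR.
FORBIDDEN = set(" \t\n\f\r\"'=<>")


def can_be_unquoted(value: str) -> bool:
    """True if the value is non-empty and contains no forbidden character."""
    return value != "" and not any(ch in FORBIDDEN for ch in value)


def render_attribute(name: str, value: str) -> str:
    """Serialize one attribute, quoting the value only when the rule requires it."""
    escaped = value.replace("&", "&amp;")  # conservative: always escape ampersands
    if can_be_unquoted(value):
        return f"{name}={escaped}"
    return '{}="{}"'.format(name, escaped.replace('"', "&quot;"))


print(render_attribute("src", "foo.ogg"))
print(render_attribute("title", "two words"))
```

Note that under the rule as quoted, a URL with "=" in its query string would still need quotes.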
Anyway. Bots should not be scraping the site. They should be using the bot API, which is *vastly* easier to parse for useful data than any variant of HTML or XHTML. We could use this as an opportunity to push bot operators toward using the API -- screen-scraping has always been fragile and should be phased out anyway. Bot operators who screen-scrape will already break on other significant changes anyway; how many screen-scrapers will keep working when Vector becomes the default skin?
So I view the added difficulty of screen-scraping as a long-term side benefit of switching to HTML 5, like validation failures for presentational elements. It makes behavior that was already undesirable more *obviously* undesirable.
Clearly we can't break all the bots, though. So try breaking XML well-formedness. If there are only a few isolated complaints, go ahead with it. If it causes large-scale breakage, revert and tell all the bot operators to switch to the API, then try again in a few months or a year. Or when we enable Vector, which will probably break all the bots anyway.
So, while I see some benefit to switching to HTML 5, I'd prefer to use XHTML 5 instead.
XHTML 5, by definition, must be served under an XML MIME type. Anything served as text/html is not XHTML 5, and is required to be an HTML (not XHTML) serialization. We cannot serve content under non-text/html MIME types, because that would break IE, so we can't use XHTML 5. Even if we could, it would still be a bad idea. In XHTML 5, as in all XML, well-formedness errors are fatal. And we can't ensure that well-formedness errors are impossible without rewriting a lot of the parser *and* UI code.
We can, however, serve HTML 5 that happens to also be well-formed XML. This will allow XML parsers to be used, and is what I propose we do to start with.
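For anyone who wants to see what this looks like from a screen-scraper's side, here is a minimal Python sketch using the standard library's XML parser. The two documents are invented for the example; the point is only that a page can carry the HTML 5 doctype and still parse as XML, while one unquoted attribute is a fatal error for an XML parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """True if an XML parser accepts the markup, False on a well-formedness error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Invented example page: HTML 5 doctype, but every tag closed and every value quoted.
WELL_FORMED = (
    "<!DOCTYPE html>"
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    '<body><p class="note">Hello</p></body></html>'
)

# The same page with the quotes dropped from one attribute, as proposed for the trial.
UNQUOTED = WELL_FORMED.replace('class="note"', "class=note")

print(is_well_formed_xml(WELL_FORMED), is_well_formed_xml(UNQUOTED))
```

A bot that relies on an XML library would fail outright on the second page, which is why the change is worth trying as a separate, easily reverted step.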
On Tue, Jul 7, 2009 at 2:48 AM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
What do you think we're doing now? A jpeg 'poster' is displayed. When the user clicks the poster is replaced by the appropriate playback mechanism.
I'm confused. What we're currently doing (correct me if I'm wrong) is displaying a JPEG <img> as a poster, and replacing it via JavaScript with the appropriate content when it's clicked. What we should do, ideally, is use something like <video src=foo.ogg poster=bar.jpg>, which will cause the poster to be displayed in place of the video on conformant browsers (including Firefox 3.6, but not 3.5). Of course, the <img> can be put in the fallback content for the <video>.
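A sketch of the markup I have in mind, generated from Python only so the escaping is explicit. The function name is hypothetical, the "controls" attribute and the empty alt text are my own additions for the example, and the file names are the placeholders used above.

```python
from html import escape


def video_with_fallback(src: str, poster: str) -> str:
    """Build a <video> element whose fallback content is the poster image."""
    src_attr = escape(src, quote=True)
    poster_attr = escape(poster, quote=True)
    return (
        f'<video src="{src_attr}" poster="{poster_attr}" controls>'
        # Browsers without <video> support render the fallback content instead.
        f'<img src="{poster_attr}" alt="">'
        "</video>"
    )


print(video_with_fallback("foo.ogg", "bar.jpg"))
```

Browsers that understand the element ignore the inner image; everything else shows the image, which the existing JavaScript can then replace as it does now.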
I said it needed to be weighed, not that the weighing would come out any particular way. I'm a fan of using Video natively. The fact that it makes save-page work the way it should is really great.
Okay, great.
I'm not sure how you think it currently works but there is currently zero need to load cortado for HTML5 supporting browsers.
I was probably confused about what "Cortado" is -- apparently it's only the Java-based player, not the whole JavaScript framework? I never looked into our implementation of this very much. Anyway, the point is we won't have to load the JavaScript logic even if the user does have JavaScript enabled, which is a plus.
Great, looks like the HTML5 vs. XHTML fight is infecting everything.
Just my 2 cents - I don't think that switching to a new spec that is not yet a W3C Recommendation is a good idea - many extensions and features are not yet finished (e.g. RDFa support for it), and considering the huge commotion in this area it might not be a very good decision.
Thank you,
Sergey
-- Sergey Chernyshev http://www.sergeychernyshev.com/
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Tue, Jul 7, 2009 at 2:29 PM, Sergey Chernyshev <sergey.chernyshev@gmail.com> wrote:
Just my 2 cents - I don't think that switching to a new spec that is not yet a W3C Recommendation is a good idea - many extensions and features are not yet finished (e.g. RDFa support for it)
Much of the spec is very stable. We would not be using any part that's likely to change -- in most cases, only parts that have multiple interoperable implementations. Such parts of the spec will not change significantly; that's a basic principle of most W3C specs' development processes (and HTML 5's in particular).
We use other W3C specs that nominally aren't stable, e.g., some parts of CSS. We used plenty of CSS 2.1 when that was still nominally a Working Draft. We use multi-column layout (at least in our content on enwiki) even though that's a Working Draft. Etc. Given the way the W3C works, it's not reasonable at all to require that the *whole* spec be a Candidate Recommendation or whatever. You can make a feature-by-feature stability assessment pretty easily in most cases: if it has multiple interoperable implementations, it's stable and can be used; if it doesn't, it's not very useful anyway, so who cares?
and considering a huge commotion in this area it might not be a very good decision.
There is no more commotion. XHTML 2.0 is officially dead. The working group is disbanded. HTML 5 is the only version of HTML that is being developed.
I don't think you've raised any substantive objections here. *Practically* speaking, what reason is there not to begin moving to HTML 5 now?
On Tue, Jul 7, 2009 at 2:46 PM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
Much of the spec is very stable. We would not be using any part that's likely to change -- in most cases, only parts that have multiple interoperable implementations. Such parts of the spec will not change significantly; that's a basic principle of most W3C specs' development processes (and HTML 5's in particular).
To elaborate on this, from the WHATWG FAQ:
"Different parts of the specification are at different maturity levels. Some sections are already relatively stable and there are implementations that are already quite close to completion, and those features can be used today (e.g. <canvas>). But other sections are still being actively worked on and changed regularly, or not even written yet.
"You can see annotations in the margins showing the estimated stability of each section. . . .
"The point to all this is that you shouldn’t place too much weight on the status of the specification as a whole. You need to consider the stability and maturity level of each section individually." http://wiki.whatwg.org/wiki/FAQ#When_will_HTML_5_be_finished.3F
"When will we be able to start using these new features?
"As soon as browsers begin to support them. You do not need to wait till HTML5 becomes a recommendation, because that can’t happen until after the implementations are completely finished.
"For example, the <canvas> feature is already widely implemented.
"The specification has annotations in the margins showing what browsers implement each section." http://wiki.whatwg.org/wiki/FAQ#When_will_we_be_able_to_start_using_these_ne...
I'm only considering the projects I was going to work on and can't speak for all the things the MediaWiki team should have in mind - I was going to add support for RDFa (http://www.w3.org/TR/rdfa-syntax/), which is currently a W3C Recommendation, but only for XHTML; even though HTML profiles (or whatever they are called) are in the works, they are not ready yet.
Switching to a non-Recommendation will mean that implementing RDFa in a standards-compliant form will have to be postponed for quite a while.
As for the commotion I mentioned, I believe there is at least tension between the RDFa world and the "Microdata" world that is being pushed along with the HTML 5 spec.
Thank you,
Sergey
-- Sergey Chernyshev http://www.sergeychernyshev.com/
Aryeh Gregor wrote:
On Tue, Jul 7, 2009 at 2:37 AM, Remember the dot <rememberthedot@gmail.com> wrote:
Why be cruel to our bot operators? XHTML is simpler and more consistent than tag soup HTML, and it's a lot easier to find a good XML parser than a good HTML parser.
Because it will make the markup easier to read and write for humans, and smaller. Things like leaving off superfluous closing elements do not make for "tag soup". One of the great features of HTML 5 is that it very carefully defines the text/html parsing model in painstaking backward-compatible detail. For example, the description of unquoted attributes is as follows:
Technically HTML 4 is pretty much the same in this regard; it's 100% legitimate SGML and HTML 4 to skip implied opening and closing elements, drop quotes on attribute values that are unambiguous, etc.
HTML 5 is a little better I think in that it specifies which SGML short forms are required to be supported and which shouldn't (for instance few browsers support this SGML short form: <b/this is some bold text/).
The primary advantage of the XML formulation is that you can parse the document tree unambiguously *without* knowing the spec of the individual markup -- omitting implied values means the consumer needs to know what to expect.
Is this really a huge advantage when the impliable elements are well-known as in HTML? I dunno.
It can cause problems when a new element with implied behavior is added, as with WebKit's initial <canvas> implementation. (Apple implemented it as allowing an implied empty element, whereas Mozilla requires you to close it so it won't confuse parsers that don't know it should be empty and thus closed immediately.)
But as long as new markup extensions are used unambiguously, HTML 5 should be no more ambiguous and just as extensible as the XML formulation.
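To illustrate the point about implied elements, here is a small Python sketch using the standard library's html.parser, which reports tags as it sees them without building a tree. The class and function names are invented for this example.

```python
from html.parser import HTMLParser


class TagEvents(HTMLParser):
    """Record start and end tags exactly as they appear in the source."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup: str):
    parser = TagEvents()
    parser.feed(markup)
    parser.close()
    return parser.events


# No end tags for <li> appear in the source, so none are reported.
print(tag_events("<ul><li>one<li>two</ul>"))
```

Nothing in the event stream says where each list item ends; a consumer has to know the HTML rule that a new <li> or the closing </ul> implies it, which is exactly the knowledge an XML consumer would not need.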
-- brion
At a minimum, I'm glad to see the dead-ended XHTML 2 working group officially killed; actual compatible implementations of ongoing work are happening in the HTML 5 world and that's where the future definitely is.
I don't see much need for us to stick with the XML formulation for the next generation, given that we've never actually served our XHTML 1 *marked* as application/xhtml+xml for compatibility reasons:
* IE refuses to display any content usefully
* Safari gets confused about character references
* even Mozilla will have different JS behavior, which would require us to jump through some more hoops to kill the last document.write() calls...
* not to mention that your entire web site becomes inaccessible instantly if you end up with a markup error in the page footer!
Unless we're embedding our XHTML into other XML streams (which we're not), there's little benefit to strictly sticking to the XML formulation for page output.
XML formulation could perhaps be useful if we migrate page text storage from custom markup to an HTML-based internal format, as we could then toss it at XML parsers without worrying. But that doesn't have any bearing on the HTML user interface we display to end-users in browsers.
-- brion
On Tue, Jul 7, 2009 at 4:46 PM, Brion Vibber <brion@wikimedia.org> wrote:
Technically HTML 4 is pretty much the same in this regard; it's 100% legitimate SGML and HTML 4 to skip implied opening and closing elements, drop quotes on attribute values that are unambiguous, etc.
Not entirely. HTML 4 doesn't allow you to omit quotes on attribute values that contain non-name characters, for instance, at least according to the W3C validator -- so you need quotes for all URLs, for example. These aren't necessary either in practice, or in HTML 5. I'm pretty sure the requirements for opening and closing elements are stricter in HTML 4 as well.
HTML 5 tends to loosen things up to whatever all browsers support, which is usually more lenient than what HTML 4 formally allows. It also actually specifies what constructs must be supported, in painstaking detail, so you can figure out what's legal without dumping it in a validator and hoping the validator's correct . . .
On Tue, Jul 7, 2009 at 4:49 PM, Brion Vibber <brion@wikimedia.org> wrote:
At a minimum, I'm glad to see the dead-ended XHTML 2 working group officially killed; actual compatible implementations of ongoing work are happening in the HTML 5 world and that's where the future definitely is.
I don't see much need for us to stick with the XML formulation for the next generation, given that we've never actually served our XHTML 1 *marked* as application/xhtml+xml for compatibility reasons:
- IE refuses to display any content usefully
- Safari gets confused about character references
- even Mozilla will have different JS behavior, which would require us to jump through some more hoops to kill the last document.write() calls...
- not to mention that your entire web site becomes inaccessible instantly if you end up with a markup error in the page footer!
Unless we're embedding our XHTML into other XML streams (which we're not), there's little benefit to strictly sticking to the XML formulation for page output.
XML formulation could perhaps be useful if we migrate page text storage from custom markup to an HTML-based internal format, as we could then toss it at XML parsers without worrying. But that doesn't have any bearing on the HTML user interface we display to end-users in browsers.
Does that mean "go ahead and begin switching to HTML 5 now", or what?
Aryeh Gregor wrote:
On Tue, Jul 7, 2009 at 4:46 PM, Brion Vibber <brion@wikimedia.org> wrote:
Technically HTML 4 is pretty much the same in this regard; it's 100% legitimate SGML and HTML 4 to skip implied opening and closing elements, drop quotes on attribute values that are unambiguous, etc.
Not entirely. HTML 4 doesn't allow you to omit quotes on attribute values that contain non-name characters, for instance, at least according to the W3C validator -- so you need quotes for all URLs, for example. These aren't necessary either in practice, or in HTML 5. I'm pretty sure the requirements for opening and closing elements are stricter in HTML 4 as well.
Remember Postel's robustness principle (paraphrased):
be conservative in what you send, liberal in what you accept
If quotes are always permitted, then always send the quotes.
If closing tags are always permitted, then always send the tags.
The browsers will handle them, and we don't need to worry about the flavor of browser.
There's no need to over-optimize the output. The intended viewer isn't human, and we're not talking about enough extra characters that very slow links will be congested....
HTML 5 tends to loosen things up to whatever all browsers support, which is usually more lenient than what HTML 4 formally allows. It also actually specifies what constructs must be supported, in painstaking detail, so you can figure out what's legal without dumping it in a validator and hoping the validator's correct . . .
Great. Does that mean HTML 5 browsers will still accept formal HTML 4?
Then, let's stick to the "stricter" interpretation.
On Tue, Jul 7, 2009 at 4:49 PM, Brion Vibber <brion@wikimedia.org> wrote:
XML formulation could perhaps be useful if we migrate page text storage from custom markup to an HTML-based internal format, as we could then toss it at XML parsers without worrying. But that doesn't have any bearing on the HTML user interface we display to end-users in browsers.
Does that mean "go ahead and begin switching to HTML 5 now", or what?
My thought is that the HTML 5 tags that are marked as well-supported could be used, but be very cautious about abandoning HTML 4. There are a lot of old machines out there, and many cannot upgrade to newer browsers, because they cannot upgrade their underlying operating systems.
For example: schools, already heavy *pedia users. And political campaigns often use cast-off machines. Win98 or 2K means no upgrades.
On Wed, Jul 08, 2009 at 10:58:43AM -0400, William Allen Simpson wrote:
My thought is that the HTML 5 tags that are marked as well-supported could be used, but be very cautious about abandoning HTML 4. There are a lot of old machines out there, and many cannot upgrade to newer browsers, because they cannot upgrade their underlying operating systems.
For example: schools, already heavy *pedia users. And political campaigns often use cast-off machines. Win98 or 2K means no upgrades.
I don't think there is any suggestion that backwards compatibility should be broken. MediaWiki is a project which has strived to keep full compatibility with most browsers since its creation.
Robert
On Wed, Jul 8, 2009 at 10:58 AM, William Allen Simpson <william.allen.simpson@gmail.com> wrote:
Remember Postel's robustness principle (paraphrased):
be conservative in what you send, liberal in what you accept
This applies only if there's some reason to be conservative. There's no reason to arbitrarily send only a subset of possible markup if every browser that supports that subset will support the full range of markup as well. Restricting ourselves to HTML 4 based on the principle of being conservative in what we send makes no more sense than restricting ourselves to, I don't know, class names that contain only the letters z, f, and q. It won't increase compatibility -- it's just a pointless inconvenience.
If quotes are always permitted, then always send the quotes.
If closing tags are always permitted, then always send the tags.
The browsers will handle them, and we don't need to worry about the flavor of browser.
There is *no* flavor of browser that requires quotes or closing tags. None. I'd be willing to bet there's not a single one released in the last ten years that ever attained more than 0.1% overall market share, say. Any browser that did things like requiring quotes or closing tags would *completely* *break* the web. It wouldn't display a majority of websites correctly, and nobody would use it. This is a fact.
(In any event, HTML 4 doesn't require either quotes or closing tags in all circumstances, although it requires them more often than HTML 5 does.)
There's no need to over-optimize the output. The intended viewer isn't human, and we're not talking about enough extra characters that very slow links will be congested....
We're talking about a few percent difference in size, for almost no effort on our part. And I would say that it definitely is a slight plus if the HTML is more human-readable. What are the concrete downsides?
Great. Does that mean HTML 5 browsers will still accept formal HTML 4?
Then, let's stick to the "stricter" interpretation.
All browsers are "HTML 5 browsers" in the sense of not requiring quotes or closing tags when HTML 5 doesn't require them. Those parts of HTML 5 are just reverse-engineered from existing behavior that's been de facto standard since early in the IE-Netscape wars, at least.
My thought is that the 5 tags that are marked as well-supported could be used, but be very cautious about abandoning 4. There are a lot of old machines out there, and many cannot upgrade to newer browsers, because they cannot upgrade their underlying operating systems.
For example: schools, already heavy *pedia users. And political campaigns often use cast-off machines. Win98 or 2K means no upgrades.
Nothing I have proposed will have even the smallest negative impact on anyone's ability to view Wikipedia in a web browser, even with very old browsers. The only negative effect will be on non-web-browser users, who we don't want screen-scraping anyway.
On Wed, Jul 8, 2009 at 11:15 AM, Sergey Chernyshev <sergey.chernyshev@gmail.com> wrote:
I'm only considering the projects I was going to work on and can't speak for all the things the MediaWiki team should have in mind. I was going to add support for RDFa (http://www.w3.org/TR/rdfa-syntax/), which is currently a W3C Recommendation, but only for XHTML; HTML profiles (or whatever they are called) are in the works, but they are not ready yet.
Switching to a non-Recommendation will mean that implementing RDFa in a standards-compliant form will have to be postponed for quite a while.
I'm pretty sure this will be resolved within a matter of months, one way or another. Either Ian will cave and support RDFa, or RDFa will support HTML 5 (at least in a usable draft form) without HTML 5's explicit agreement, or microdata will gain support as wide as RDFa. At worst, you can still use MW 1.15 while things are being worked out. Or maybe we could provide a switch to allow HTML 5 or XHTML, but I'm leery of that, since it negates most of the benefits.
I admit that I don't follow RDF and "semantic web" stuff too closely, so I'm not very qualified to address this objection. I'm pretty sure that RDFa support is not an issue for the overwhelming majority of our users, however. On the other hand, improved <video> support and better form handling for a significant percentage of our users are examples of clear and concrete benefits from HTML 5.
Is this actually a *practical* problem even for the very small number of users who want to use RDFa? I mean, will RDFa really not work with HTML 5 in practice, or will it work but it's not standardized?
As for commotion I mentioned, I believe there is at least tension between RDFa world and "Microdata" world that is being pushed along HTML 5 spec.
Yes, there definitely is tension there! Just not between HTML 5 and XHTML 2 -- that's over, even if a few people might not have gotten the message yet. I don't know what will happen with RDFa vs. microdata. I find it unlikely that anyone will convince Ian to include RDFa at this point with just arguments. But if it sees much wider adoption than microdata, he'd probably include it.
On Wed, Jul 8, 2009 at 11:15 AM, Sergey Chernyshev <sergey.chernyshev@gmail.com> wrote:
I'm only considering the projects I was going to work on and can't speak for all the things the MediaWiki team should have in mind. I was going to add support for RDFa (http://www.w3.org/TR/rdfa-syntax/), which is currently a W3C Recommendation, but only for XHTML; HTML profiles (or whatever they are called) are in the works, but they are not ready yet.
Switching to a non-Recommendation will mean that implementing RDFa in a standards-compliant form will have to be postponed for quite a while.
I'm pretty sure this will be resolved within a matter of months, one way or another. Either Ian will cave and support RDFa, or RDFa will support HTML 5 (at least in a usable draft form) without HTML 5's explicit agreement, or microdata will gain support as wide as RDFa. At worst, you can still use MW 1.15 while things are being worked out. Or maybe we could provide a switch to allow HTML 5 or XHTML, but I'm leery of that, since it negates most of the benefits.
I admit that I don't follow RDF and "semantic web" stuff too closely, so I'm not very qualified to address this objection. I'm pretty sure that RDFa support is not an issue for the overwhelming majority of our users, however. On the other hand, improved <video> support and better form handling for a significant percentage of our users are examples of clear and concrete benefits from HTML 5.
I see your point - video is clearly more popular than RDFa, and if you're willing to go off-standard to support it, that might be a reasonable decision for a site like Wikipedia. I'm not sure what the rush is, though, or why it can't wait until the HTML 5 spec becomes a Recommendation.
I'm not familiar enough with HTML 5 support in modern browsers to say whether there will be regressions in other areas, but that might be another thing to consider, although Wikipedia might be big enough to be a driving force in such decisions.
Is this actually a *practical* problem even for the very small number of users who want to use RDFa? I mean, will RDFa really not work with HTML 5 in practice, or will it work but it's not standardized?
Sorry, I can't give you a definitive answer - CCing the RDFa list for this.
Guys, I'd be happy if you could explain where RDFa support stands here.
As for commotion I mentioned, I believe there is at least tension between RDFa world and "Microdata" world that is being pushed along HTML 5 spec.
Yes, there definitely is tension there! Just not between HTML 5 and XHTML 2 -- that's over, even if a few people might not have gotten the message yet. I don't know what will happen with RDFa vs. microdata. I find it unlikely that anyone will convince Ian to include RDFa at this point with just arguments. But if it sees much wider adoption than microdata, he'd probably include it.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Wed, Jul 8, 2009 at 5:41 PM, Sergey Chernyshev <sergey.chernyshev@gmail.com> wrote:
I see your point - video is clearly more popular than RDFa, and if you're willing to go off-standard to support it, that might be a reasonable decision for a site like Wikipedia. I'm not sure what the rush is, though, or why it can't wait until the HTML 5 spec becomes a Recommendation.
The HTML 5 FAQ has useful info on this:
http://wiki.whatwg.org/wiki/FAQ
See especially "When will HTML 5 be finished?" and "When will we be able to start using these new features?" HTML 5 likely will not reach even *Candidate* Recommendation stage until 2012, and might take until 2020 or later to get to Recommendation status. It's a very large spec, and there's absolutely no reason not to use the parts that are fully fleshed out, implemented, and stable just because some other parts are less stable. As I said before, we've always done this with CSS; and this is the official position of the ones responsible for writing and implementing the HTML 5 specification.
I'm not familiar enough with HTML 5 support in modern browsers to say whether there will be regressions in other areas, but that might be another thing to consider, although Wikipedia might be big enough to be a driving force in such decisions.
We are talking about using only polished, finalized features that are implemented in stable browsers which have undergone considerable testing. Unless you can point out specific problems that might arise, there's no reason to anticipate more risk of unexpected problems with HTML 5 than with any other new type of functionality we deploy. There might hypothetically be some useful things that work in XHTML 1 but not HTML 5, but there are *definitely* a number of useful things that work in HTML 5 but not XHTML 1, which have already been outlined in this thread. (And in practice, the WHATWG has made a point of incorporating all unequivocally useful features from XHTML in some form.)
Keep in mind that changing MediaWiki to output valid HTML 5 (modulo GIGO) instead of XHTML 1 on a normal page view would probably take under 20 lines of code changes. I could do it in five minutes. This is an *extremely* small change. Each specific feature of HTML 5 that we use after that can be evaluated for deployment on a case-by-case basis, just as we'd evaluate any other new technology like web fonts or RDFa. If problems do arise from switching to HTML 5 per se, it would be easy to change back to XHTML.
Apparently something ate my last post here. (I think it was my Chromium nightly build.) Okay, reposting from memory:
After discussion with Brion on IRC, I've provisionally enabled an HTML 5 doctype in r53034:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/53034
My thoughts on what we should do in the immediate future are:
1) Get at least the enwiki Main Page set up so it will validate as HTML 5 when we scap: http://validator.w3.org/check?uri=http://en.wikipedia.org/wiki/&charset=(detect+automatically)&doctype=HTML5&group=0
1a) Remove border="0" from Wikimedia's $wgCopyrightIcon (it does nothing anyway).
1b) Rope some enwiki sysops into getting rid of all cellpadding, cellspacing, align, and clear attributes on the Main Page (converting them to CSS).
2) Scap (whenever this happens -- maybe not so immediate future :) ).
3) Wait a couple of hours to see if anything breaks.
4) Make a tech blog post and post a notice to the whatwg list (I'll do this). We'll have our front page validating as HTML 5 at this point, hopefully, to make a more positive impact.
5) See what happens!
I expect this will pick up some interest, since we'll probably be increasing the number of HTML 5 page views by a factor of -- oh, ten thousand? (Is there any top *1000* site that uses HTML 5 for all its primary content?) We can see how things develop, and if all goes well start using more HTML 5 features.
I'd recommend that until the code goes live, this should be considered an *experimental* *development* change. People shouldn't go around announcing this everywhere until it's actually live. For one thing, some unknown problem might crop up and we'd have to temporarily roll back, which would cause confusion and bad press for both us and HTML 5. For another thing, it would be nice if we could link to a validating main page in the announcement. I'm sure people can hold off posting stories to Slashdot for a week or two, right? :)
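The validator link in step 1 can also be built programmatically, which is handy when checking more pages than the Main Page. The following is only a sketch: the query parameters (uri, charset, doctype, group) are copied from the link quoted above, the function name is made up for illustration, and the validator may not accept these parameters indefinitely.

```python
from urllib.parse import urlencode

# Base address taken from the validator link quoted in the thread.
VALIDATOR = "http://validator.w3.org/check"


def validator_url(page_url, doctype="HTML5"):
    """Build a validator link that overrides the doctype (hypothetical helper)."""
    query = urlencode({
        "uri": page_url,
        "charset": "(detect automatically)",
        "doctype": doctype,
        "group": "0",
    })
    return f"{VALIDATOR}?{query}"


if __name__ == "__main__":
    print(validator_url("http://en.wikipedia.org/wiki/"))
```

urlencode percent-encodes the parentheses and slashes, so the printed link looks slightly different from the hand-written one but carries the same parameters.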
2009/7/10 Aryeh Gregor <Simetrical+wikilist@gmail.com>:
1b) Rope some enwiki sysops into getting rid of all cellpadding, cellspacing, align, and clear attributes on the Main Page (converting them to CSS).
*waves*
I'll forward your post to wikien-l too.
Give us a list of what to do.
4) Make a tech blog post and post a notice to the whatwg list (I'll do this). We'll have our front page validating as HTML 5 at this point, hopefully, to make a more positive impact.
5) See what happens!
I expect this will pick up some interest, since we'll probably be increasing the number of HTML 5 page views by a factor of -- oh, ten thousand? (Is there any top *1000* site that uses HTML 5 for all its primary content?) We can see how things develop, and if all goes well start using more HTML 5 features.
WIN!
- d.
On Fri, Jul 10, 2009 at 10:26 AM, David Gerard <dgerard@gmail.com> wrote:
*waves*
I'll forward your post to wikien-l too.
Give us a list of what to do.
Here:
http://en.wikipedia.org/wiki/Talk:Main_Page#Requested_adjustments_to_Main_Pa...
It mainly needs testing across various browsers. I already did some, and I'm reasonably confident that it will work modulo a pixel of whitespace here and there, but it's hard to be entirely sure without actually deploying it (especially since it's possible I made an error in copying it from my local machine to Wikipedia).
I support using HTML 5's new features, but I don't like the idea of starting to strip tags "just because we can". Currently MediaWiki does quite a good job there. I don't see a reason to start removing tags. Yes, allegedly there's a space improvement, but still... Perhaps we should also look into alternative solutions like SDCH.
Aryeh Gregor wrote:
Apparently something ate my last post here. (I think it was my Chromium nightly build.) Okay, reposting from memory:
After discussion with Brion on IRC, I've provisionally enabled an HTML 5 doctype in r53034:
I see the "Attribute name not allowed on element a at this point." error has been taken care of in r52963. Interestingly, the attribute had been removed in r38323 and re-added by Simetrical in r45418.
My thoughts on what we should do in the immediate future are:
1) Get at least the enwiki Main Page set up so it will validate as HTML 5 when we scap: http://validator.w3.org/check?uri=http://en.wikipedia.org/wiki/&charset=(detect+automatically)&doctype=HTML5&group=0
On Fri, Jul 10, 2009 at 8:04 PM, Platonides <Platonides@gmail.com> wrote:
I support using HTML 5's new features, but I don't like the idea of starting to strip tags "just because we can". Currently MediaWiki does quite a good job there. I don't see a reason to start removing tags. Yes, allegedly there's a space improvement, but still...
It's something to consider. It will improve not only space, but also readability. Here's the doctype and <head> for http://aryeh.name/, in valid HTML 5:
<!doctype html>
<link rel=stylesheet href=/css/main.css>
<title>Risen from Prey ✡ מטרף עלה</title>
<!--[if IE]><script src=/html5ie.js></script><![endif]-->
That's it. Here's what it would have to be in XHTML 1.0 Transitional:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" href="/css/main.css" type="text/css" />
<title>Risen from Prey ✡ מטרף עלה</title>
<!--[if IE]><script src="/html5ie.js" type="text/javascript"></script><![endif]-->
</head>
And that's even omitting the extra <meta> tags I'd need to use if I had inline style and script, which would make it:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" href="/css/main.css" type="text/css" />
<title>Risen from Prey ✡ מטרף עלה</title>
<!--[if IE]><script src="/html5ie.js" type="text/javascript"></script><![endif]-->
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
</head>
Look at those two side by side for a minute, the first and the third, and tell me there's no reason to go with the first one if there's demonstrably no difference in how browsers treat them. Improving legibility for human readers of our HTML source isn't a *major* goal, but I don't think we should disregard it entirely, especially when there are modest size improvements to be had as well. The only reason I can think of to avoid it other than "leave well enough alone" is for the sake of screen-scraping bots.
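To put a rough number on the size difference between the first and third versions, one can simply count their UTF-8 bytes. This is only a sketch: the two strings are the snippets quoted above (with one tag per line), the function names are made up, and the result covers just the doctype and <head>, not a whole page.

```python
# The HTML 5 doctype and <head> quoted in the thread.
HTML5_HEAD = """<!doctype html>
<link rel=stylesheet href=/css/main.css>
<title>Risen from Prey ✡ מטרף עלה</title>
<!--[if IE]><script src=/html5ie.js></script><![endif]-->
"""

# The XHTML 1.0 Transitional equivalent, including the two extra <meta> tags.
XHTML_HEAD = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" href="/css/main.css" type="text/css" />
<title>Risen from Prey ✡ מטרף עלה</title>
<!--[if IE]><script src="/html5ie.js" type="text/javascript"></script><![endif]-->
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
</head>
"""


def utf8_size(markup):
    """Return the number of bytes the markup occupies when sent as UTF-8."""
    return len(markup.encode("utf-8"))


def bytes_saved(short, long):
    """Return how many bytes the shorter markup saves over the longer one."""
    return utf8_size(long) - utf8_size(short)


if __name__ == "__main__":
    print("HTML 5:", utf8_size(HTML5_HEAD), "bytes")
    print("XHTML 1.0:", utf8_size(XHTML_HEAD), "bytes")
    print("saved:", bytes_saved(HTML5_HEAD, XHTML_HEAD), "bytes")
```

The saving is large relative to the <head> alone; as a share of a full page it is the few percent mentioned earlier in the thread.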
Perhaps we should also look into alternative solutions like SDCH.
SDCH is not going to be usable anytime in the foreseeable future, AFAICT.
I see the "Attribute name not allowed on element a at this point." error has been taken care of in r52963. Interestingly, the attribute had been removed in r38323 and re-added by Simetrical in r45418.
r45418 didn't re-add it, actually, it was r38328.
:)
Aryeh Gregor wrote:
Look at those two side by side for a minute, the first and the third, and tell me there's no reason to go with the first one if there's demonstrably no difference in how browsers treat them. Improving legibility for human readers of our HTML source isn't a *major* goal, but I don't think we should disregard it entirely, especially when there are modest size improvements to be had as well. The only reason I can think of to avoid it other than "leave well enough alone" is for the sake of screen-scraping bots.
OK, I've looked. I'm certainly no expert in hand editing html, although I've done more than enough over the years, but I just don't see the problem that's being solved.
Many/most pages already serve up more than 32K. You're proposing a tiny savings of fractional percentages in bytes, all so it's more legible to humans that never actually see it and aren't about to edit this stuff.
You know I've agreed with you more often than not over the years, and I've never cared much about screen scraping bots after the API worked, but is this really worth the effort?
I'm of the opinion that compatibility with old browsers is much more important than human readability.
Do you have copies of W98 and W2K to regression test against?
On Sun, Jul 12, 2009 at 2:43 PM, William Allen Simpson <william.allen.simpson@gmail.com> wrote:
OK, I've looked. I'm certainly no expert in hand editing html, although I've done more than enough over the years, but I just don't see the problem that's being solved.
Many/most pages already serve up more than 32K. You're proposing a tiny savings of fractional percentages in bytes, all so it's more legible to humans that never actually see it and aren't about to edit this stuff.
Some humans do see it, namely, developers and similar sorts. People writing CSS and JS, for instance. There's value in readable code for debugging purposes, all else being equal.
You know I've agreed with you more often than not over the years, and I've never cared much about screen scraping bots after the API worked, but is this really worth the effort?
It wouldn't be much effort, especially over time.
I'm of the opinion that compatibility with old browsers is much more important than human readability.
This is much less likely to introduce compatibility problems than many other changes we make, e.g., to CSS or JS. Parsing of HTML has been pretty uniform for years except in edge cases. There have been no new features since HTML 4, after all.
Do you have copies of W98 and W2K to regression test against?
Unnecessary. It's seriously unlikely we have acceptable support for any browser before IE5 anyway, and that can be run on modern systems (I use ies4linux). In practice, we haven't gone out of our way to support browsers older than IE6 for a long time now. If someone brings up an issue, we'll consider fixing it, but we're not proactively hunting down browsers that old and trying to work around their bugs. Nobody is; cost-benefit just isn't reasonable.
In fact, I find that the Wikipedia main page is almost completely unreadable in IE5 already. I have never seen a single complaint about that, not even once.
I've made a mediawiki.org page discussing what features we might use:
I've discovered that changing the doctype does actually cause some slight rendering differences. I don't think this will be a big delay, but I've disabled HTML 5 doctypes by default (in r53137) until I can figure out the issue and resolve it.
On Sun, Jul 12, 2009 at 1:58 PM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
I've discovered that changing the doctype does actually cause some slight rendering differences. I don't think this will be a big delay, but I've disabled HTML 5 doctypes by default (in r53137) until I can figure out the issue and resolve it.
Issue fixed in r53141, and $wgHtml5 re-enabled by default in r53142. However, it turns out browsers don't treat the doctypes *exactly* the same, only *almost* exactly (as that issue demonstrated). The HTML 5 doctype triggers slightly more standards-compliant behavior for recent major browsers in some details of CSS -- which is ultimately good, but we need to be on the lookout for anything it breaks. This page has more info on doctype switches in browsers:
2009/7/12 Aryeh Gregor <Simetrical+wikilist@gmail.com>:
Issue fixed in r53141, and $wgHtml5 re-enabled by default in r53142. However, it turns out browsers don't treat the doctypes *exactly* the same, only *almost* exactly (as that issue demonstrated). The HTML 5 doctype triggers slightly more standards-compliant behavior for recent major browsers in some details of CSS -- which is ultimately good, but we need to be on the lookout for anything it breaks. This page has more info on doctype switches in browsers:
You should probably write this up for whatwg - "practical gotchas in moving a large site to HTML5."
- d.
On Sun, Jul 12, 2009 at 2:41 PM, David Gerard <dgerard@gmail.com> wrote:
You should probably write this up for whatwg - "practical gotchas in moving a large site to HTML5."
It was already public knowledge, just that knowledge didn't extend to me. :) It was only triggered in any event by what amounts to a bug in MediaWiki. I'll certainly mention it, anyway.
2009/7/12 Aryeh Gregor <Simetrical+wikilist@gmail.com>:
On Sun, Jul 12, 2009 at 2:41 PM, David Gerard <dgerard@gmail.com> wrote:
You should probably write this up for whatwg - "practical gotchas in moving a large site to HTML5."
It was already public knowledge, just that knowledge didn't extend to me. :) It was only triggered in any event by what amounts to a bug in MediaWiki. I'll certainly mention it, anyway.
I was surprised and pleased that discussion of <video> in the real world was triggered by our efforts to make stuff Just Work, so I expect a writeup of what the #4 website had to do will be of great interest to the browser engineers reading!
- d.
It turns out this is not easy so early in HTML 5's history, but after searching http://www.google.com.tw/search?q=validate+HTML5&ie=utf-8&oe=utf-8 I finally found http://www.w3.org/QA/2008/08/html5-validator-beta, which points to an HTML 5 validator at http://qa-dev.w3.org/wmvs/HEAD/, and proceeded to find genuine HTML 5 invalidities in MediaWiki output: http://qa-dev.w3.org/wmvs/HEAD/check?uri=http://transgender-taiwan.org/&...
On Mon, Jul 13, 2009 at 10:40 AM, jidanni@jidanni.org wrote:
It turns out this is not easy so early in HTML 5's history, but after searching http://www.google.com.tw/search?q=validate+HTML5&ie=utf-8&oe=utf-8 I finally found http://www.w3.org/QA/2008/08/html5-validator-beta, which points to an HTML 5 validator at http://qa-dev.w3.org/wmvs/HEAD/, and proceeded to find genuine HTML 5 invalidities in MediaWiki output: http://qa-dev.w3.org/wmvs/HEAD/check?uri=http://transgender-taiwan.org/&...
You can just use validator.w3.org.
Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
I support using HTML 5's new features, but I don't like the idea of starting to strip tags "just because we can". Currently MediaWiki does quite a good job there. I don't see a reason to start removing tags. Yes, allegedly there's a space improvement, but still...
It's something to consider. It will improve not only space, but also readability. Here's the doctype and <head> for http://aryeh.name/, in valid HTML 5:
[...]
Look at those two side by side for a minute, the first and the third, and tell me there's no reason to go with the first one if there's demonstrably no difference in how browsers treat them. Improving legibility for human readers of our HTML source isn't a *major* goal, but I don't think we should disregard it entirely, especially when there are modest size improvements to be had as well. The only reason I can think of to avoid it other than "leave well enough alone" is for the sake of screen-scraping bots. [...]
I don't know what Platonides' point was specifically, but personally I find "hanging" tags (e.g. lacking close tags) very error-prone. I think if one has to explicitly close elements, the probability of a "missed" one (that leaves text bold till kingdom^Wthe next paragraph starts) drops drastically. The same goes for attributes in '"'s - if you put them around all your attributes, you do not have to think about whether each single attribute has a value that needs them.
So, while you could save some bytes in this process, you'd have to spend much more time on testing.
Tim
On Mon, Jul 13, 2009 at 2:52 PM, Tim Landscheidt <tim@tim-landscheidt.de> wrote:
I don't know what Platonides' point was specifically, but personally I find "hanging" tags (e.g. lacking close tags) very error-prone. I think if one has to explicitly close elements, the probability of a "missed" one (that leaves text bold till kingdom^Wthe next paragraph starts) drops drastically.
Not all tags in HTML 5 self-close, only some. <b>, for instance, must be explicitly closed, so you can't get bold running to the end of the paragraph. It's generally only block-level tags that auto-close, and it makes no sense to ever close those before the next block begins (which is when they auto-close). You aren't going to write:
<p>Foo bar <p>Baz
and actually mean:
<p>Foo</p> bar <p>Baz</p>
That would frequently be invalid anyway.
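The auto-closing rule can be sketched in a few lines on top of Python's html.parser. This is a toy illustration, not how any browser or MediaWiki works: the class name and the CLOSES_P set are made up here, the set is only a subset of the tags that end an open <p>, and attributes are dropped for brevity.

```python
from html.parser import HTMLParser

# Assumption: a small subset of the start tags that implicitly close an open <p>.
CLOSES_P = {"p", "div", "ul", "ol", "table", "pre", "blockquote", "h1", "h2", "h3"}


class ParagraphCloser(HTMLParser):
    """Re-emit markup with the implied </p> end tags written out."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.p_open = False

    def handle_starttag(self, tag, attrs):
        # A new block-level element ends the paragraph that is still open.
        if self.p_open and tag in CLOSES_P:
            self.out.append("</p>")
            self.p_open = False
        if tag == "p":
            self.p_open = True
        self.out.append(f"<{tag}>")  # attributes omitted in this sketch

    def handle_endtag(self, tag):
        if tag == "p":
            self.p_open = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def close(self):
        super().close()  # flushes any buffered text first
        if self.p_open:
            self.out.append("</p>")
            self.p_open = False


def with_explicit_p(markup):
    """Return markup with every implied </p> made explicit."""
    parser = ParagraphCloser()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(with_explicit_p("<p>Foo bar <p>Baz"))
```

Note that <b> is not in the set, so an unclosed <b> is passed through unchanged; only the paragraph boundary is implied.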
Same goes for attributes in '"'s - if you put them around all your attributes, you do not have to think about whether each single attribute has a value that needs them.
We can have the logic happen automatically in an Html class, like we do with our Xml class. For manually-added values there's little to no issue: it's extremely obvious when a string needs quotes.
Even if you use quotes, as in XHTML, you have to be careful to make sure your content doesn't have the same type of quote as the value you're adding. We've had XSS vulnerabilities because htmlspecialchars() escapes only ", not '. That line of false security will be less attractive if things like spaces break the attribute values too. You'd realize more quickly that you need to use Html::attr() or whatever we cook up, and htmlspecialchars() is not enough.
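As a rough illustration of the "logic happens automatically" idea, here is a sketch in Python rather than PHP. The attr() helper below is hypothetical and is not MediaWiki's Html class; it assumes HTML 5's rule that an unquoted attribute value must be non-empty and must not contain whitespace, ", ', =, <, > or `.

```python
import html
import re

# Characters that may not appear in an unquoted HTML 5 attribute value.
_NEEDS_QUOTES = re.compile(r"[\s\"'=<>`]")


def attr(name, value):
    """Serialize one attribute, quoting the value only when required.

    Hypothetical helper for illustration; `name` is assumed to be trusted.
    """
    if value and not _NEEDS_QUOTES.search(value):
        # Safe to leave unquoted; ampersands still have to be escaped.
        return f"{name}={value.replace('&', '&amp;')}"
    # html.escape(quote=True) escapes &, <, > and both kinds of quote.
    return f'{name}="{html.escape(value, quote=True)}"'


if __name__ == "__main__":
    print(attr("rel", "stylesheet"))
    print(attr("title", "Risen from Prey"))
    print(attr("title", "it's \"quoted\""))
```

Because both quote characters are escaped in the quoted branch, the caller never has to remember which kind of quote surrounds the value.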
Aryeh Gregor wrote:
On Fri, Jul 10, 2009 at 8:04 PM, Platonides <Platonides@gmail.com> wrote:
I support using HTML 5's new features, but I don't like the idea of starting to strip tags "just because we can". Currently MediaWiki does quite a good job there. I don't see a reason to start removing tags. Yes, allegedly there's a space improvement, but still...
It's something to consider. It will improve not only space, but also readability. Here's the doctype and <head> for http://aryeh.name/, in valid HTML 5:
<!doctype html>
<link rel=stylesheet href=/css/main.css>
<title>Risen from Prey ✡ מטרף עלה</title>
<!--[if IE]><script src=/html5ie.js></script><![endif]-->
I find it too flat, beheaded, unstructured... It may be a symptom of too much HTML engineering, but I wonder if others feel the same.
Perhaps we should also look into alternative solutions like SDCH.
SDCH is not going to be usable anytime in the foreseeable future, AFAICT.
AFAIK it's implemented on Chrome and IE with Gears extension.
On Mon, Jul 13, 2009 at 4:42 PM, Platonides <Platonides@gmail.com> wrote:
Aryeh Gregor wrote:
SDCH is not going to be usable anytime in the foreseeable future, AFAICT.
AFAIK it's implemented on Chrome and IE with Gears extension.
Thus not usable for the overwhelming majority of our viewers, and not likely to become so in the foreseeable future.