HTML 5 is the up-and-coming version of the HTML standard, which
supports all sorts of new and exciting features. For those who don't
know about it, here's some background:
Wikipedia article: http://en.wikipedia.org/wiki/HTML_5
Summary of major differences from HTML 4: http://www.w3.org/TR/html5-diff/
Full specification: http://dev.w3.org/html5/spec/Overview.html
It's clear at this point that HTML 5 will be the next version of HTML.
It was obvious for a long time that XHTML was going nowhere, but now
it's official: the XHTML working group has been disbanded and work on
all non-HTML 5 variants of HTML has ceased. (Source:
<http://www.w3.org/2009/06/xhtml-faq.html>) MediaWiki will have to
switch to HTML 5 sooner or later. It's a great standard, and I think
we would do well to be early on the curve here and help spark interest
in and support for it.
HTML 5 is designed to be backward-compatible with legacy content, both
on the authoring side and (especially) the implementation side.
Well-written XHTML 1.0 should theoretically need only minor
modifications to validate as HTML 5, and indeed this appears to be the
case in practice. All that's required to get a typical page in
Monobook validating as HTML 5 in the W3C's experimental validator is
(*if* we disregard user-added markup):
* Change the doctype to "<!doctype html>".
* Delete '<meta http-equiv="Content-Style-Type" content="text/css"
/>'. Which is a really stupid element anyway. :P
* Delete name attributes from all <a> elements. They've been
redundant to id for eternity, and every browser in the universe
supports id; we can finally move these to the headers themselves.
* Remove comments from inside <script> tags with a src attribute. I
already did this in r52828, since they're pointless anyway.
(The W3C validator is at http://validator.w3.org/. You can override
the doctype and set it to interpret Wikipedia URLs as HTML 5 under
"More Options".)
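In other words, the head of a Monobook page would change roughly like this (a sketch; the surrounding markup is abbreviated and the heading text is made up):

```html
<!-- Before (XHTML 1.0 Transitional): -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<meta http-equiv="Content-Style-Type" content="text/css" />
<a name="History" id="History"></a><h2>History</h2>

<!-- After (HTML 5): -->
<!doctype html>
<h2 id="History">History</h2>
```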
Note that HTML 5 does follow in the "strict" vein of XHTML.
Presentational elements and attributes such as font, border,
cellpadding, etc. are all invalid in HTML 5. (Implementations must
support them, but conforming documents must not use them. <b> and <i>
remain valid.) There's very little of this stuff left in the HTML
that ships with the software. We can remove this incrementally as
it's reported.
For user-added content, I think it's fair to just treat it as GIGO --
if they submit invalid content that can't be easily converted to a
valid form, it will be output as-is. Users can already submit invalid
content in cases where we can't easily fix it, e.g., duplicate id's.
If we switch to HTML 5, the W3C validator will begin outputting errors
on this presentational stuff, which should hopefully encourage users
to reduce it over time, at least in high-profile places like the front
page or infoboxes.
So converting to HTML 5 would be trivial. However, in addition to
lending our support to good standards, there are several modest
practical benefits that would accrue from the switch. I include here
only things that are possible in valid HTML 5 documents, but which
would not validate as XHTML 1 (so excluding stuff like localStorage);
and which are usable right now (so excluding stuff like <nav>, <input
type=color>, etc.):
* HTML 5 permits omission of a lot of the cruft that XHTML requires.
It permits leaving off end tags in most cases where that's
unambiguous, and omitting some formerly required tags entirely (such as
<html>, <head>, and <body>, if they have no attributes). The "/>"
ending is no longer required. Superfluous attributes like
type=text/javascript on <script> are no longer needed (unless you want
to use <script type=application/x-python> or something, of course!).
Quotes may be omitted from attributes in almost all cases. The
doctype is shorter and easy to remember, and there is no xmlns
attribute. For an example of how compact valid HTML 5 can be, look at
the source of http://aryeh.name/. I once did a crude test and found
we could cut 5% or so off the length of our HTML by doing this --
*after* gzipping. Not only does this make our code smaller, it will
also make it easier to read.
* We could support <video>/<audio> on conformant user agents without
the use of JavaScript. There's no reason we should need JS for
Firefox 3.5, Chrome 3, etc.
* We can use data-* attributes to store custom data for scripts. This
came up in the HTML diff work: its author stored some data for scripts
in custom attributes, which caused XHTML 1 validation to fail.
* We can use HTML 5 form attributes. These will enhance the
experience of users of appropriate browsers, and do nothing for
others. At least Opera 9.6x already supports almost all HTML 5 form
attributes. (Source:
<http://www.opera.com/docs/specs/presto211/forms/>) We could, for
instance, give required fields the "required" attribute, which will
cause the browser to prevent the form submission and notify the user
if they aren't filled in, without needing either JavaScript or a
server-side check. The "pattern" attribute even allows requiring that
the input match a regex, and this is also supported by Opera 9.6x.
See <http://dev.w3.org/html5/spec/Overview.html#common-input-element-attributes>.
* There are a couple of parser tests that currently fail because of
misnested tags. If we altered the parser to no longer output any </p>
tags (which HTML 5 permits), these tests would immediately pass. It
doesn't look like anyone's going to fix them otherwise.
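To make a few of the items above concrete, here is a sketch in HTML 5. Every file name, attribute value, and form-field name below is made up for illustration; only the element and attribute names come from the spec:

```html
<!-- A complete, valid HTML 5 document: no <html>, <head>, or <body>
     tags, no quotes around the class attribute, no closing </p>. -->
<!doctype html>
<title>Compact example</title>
<p class=demo>Hello, world.

<!-- Native video, with fallback content for older browsers (where
     the existing JavaScript player could go): -->
<video src="Example_video.ogg" controls>
  Your browser does not support the video element.
</video>

<!-- Custom data for scripts, now valid via data-* attributes: -->
<td class="diff-context" data-diff-pos="42">...</td>

<!-- Declarative form validation with required and pattern: -->
<form action="/w/index.php" method="post">
  <input name="wpSummary" required>
  <input name="wpYear" pattern="[0-9]{4}" title="a four-digit year">
  <input type="submit" value="Save">
</form>
```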
These are only a few of the things that have immediate concrete
benefit. There are probably more I couldn't find immediately (HTML 5
is a huge spec), and of course in the long term there's an incredible
amount that would be invaluable to us.
I propose the following migration plan:
1) Fix the doctype, Content-Style-Type, and name attributes. We can
then officially claim we're shipping HTML 5! :) (Albeit maybe
invalid in some cases.) Also remove any unnecessary attributes and
elements, without breaking XML well-formedness. Begin using HTML 5
form attributes and any other useful features. Poke the Cortado
people about letting <video> work without JavaScript.
2) Once this goes live, if no problems arise, try causing an XML
well-formedness error. For instance, remove the quote marks around
one attribute of an element that's included in every page. I suggest
this as a separate step because I suspect there are some bot operators
who are doing screen-scraping using XML libraries, so it would be a
good idea to assess how feasible it is at the present time to stop
being well-formed. In the long run, of course, those bot operators
should switch to using the API. If we receive enough complaints once
this goes live, we can revert it and continue to ship HTML 5 that's
also well-formed XML, for the time being.
3) If XML well-formedness is not a problem, get rid of all unneeded
closing tags, quotation marks, self-closing "/>" constructs, etc.
Create an Html class like Xml, which will generate elements in the
nice compact form that HTML 5 permits, and phase out use of Xml in
favor of Html. (Xml has long since ceased to be purely about XML
anyway.)
So, what are people's thoughts?
I propose that an additional checksum of the revision text be added to
the MediaWiki database and that this checksum be made available via the
database dumps and API calls.
This additional field would allow many computations such as revert and
noop detection without having to ask the system to provide the full text
of revisions. For example, if I were to build a user script to show
users which revisions have been reverted, it would be beneficial to not
have to ask the API for the full text of a large list of revisions. On
that same note, even when I need the full text of revisions, I could
determine which revisions I do not need to request by determining that
their content is exactly the same as one that has already been retrieved.
It does not seem that such a field would require considerably more
storage or computational power: computing an MD5 checksum in PHP is
cheap, and storing 32 hex characters is negligible compared to the size
of an article's text.
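For example, revert and no-op detection with such a checksum could look like this (Python used for illustration, with hashlib standing in for PHP's md5(); the revision texts are invented):

```python
import hashlib

def rev_checksum(text):
    """Return the 32-hex-character MD5 checksum of a revision's text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def find_reverts(revisions):
    """Given revision texts in chronological order, flag revisions whose
    checksum matches an earlier revision (revert/no-op candidates),
    without ever comparing full texts."""
    seen = {}      # checksum -> index of first revision with that text
    reverts = []
    for i, text in enumerate(revisions):
        digest = rev_checksum(text)
        if digest in seen:
            reverts.append((i, seen[digest]))  # (revision, reverted-to)
        else:
            seen[digest] = i
    return reverts

history = ["stub", "stub + vandalism", "stub"]
print(find_reverts(history))  # the third edit restores the first: [(2, 0)]
```

With the checksums already in the dump or API response, only the comparison step is needed client-side; no revision text has to be fetched twice.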
Thanks,
-Aaron Halfaker
Who is in charge of the Wikimedia Technical Blog?
Consider adding a "contact" link on your blog.
However, make sure it doesn't go through the same filter as the
following problem:
After each comment I post, logged in or not, it says:
Your location has been identified as part of a reported spam network. Comments have been disabled to prevent spam.
Yesterday from 122.127.*, Today from 218.163.*
https://bugzilla.wikimedia.org/show_bug.cgi?id=19540
http://lists.wikimedia.org/pipermail/wikitech-l/2009-July/043834.html
Hi!
an interesting fact from operations.. ;-) a template change on
commons.wikimedia.org grew the database by 50% in a week or so,
pushing all s3 servers close to 100% disk use... *ouch*
Domas
Sorry if this is not the correct forum to report a possible bug.
I noticed browsing the "Example of second-order singular perturbation
theory" section of the "Perturbation Theory" page on wikipedia:
---
Consider the following equation for the unknown variable <math>x</math>:
:<math>x=1+\epsilon x^5.</math>
For the initial problem with <math>\epsilon=0</math>, the solution is
<math>x_0=1</math>. For small <math>\epsilon</math> the lowest order
approximation may be found by inserting the [[ansatz]]
:<math>x=x_0+\epsilon x_1 (+\cdots)</math>
---
that the fonts used for "x" in the :<math> sections render
differently. In particular, on my Linux box there are no serifs on the
x's in the first equation and in x_0=1, while on the others there are
small serifs. I don't see why they should render differently; the
source markup certainly doesn't indicate anything different to me...
I've tested this on:
IE7.0 Windows
Safari 4.0 on OSX 10.5
Firefox 3.5 on OSX 10.5
Firefox 3.0.10 on Linux 2.6.18-128.1.10.el5
I've attached a small screenshot of what I see.
M
I don't want to irritate people by asking inappropriate questions on this list. So please direct me to the right list if this is the wrong one for this question.
I ran parserTests and 45 tests failed. The result was:
Passed 559 of 604 tests (92.55%)... 45 tests failed!
I expect this indicates a problem, but sometimes test suites are set up so certain tests fail. Is this result good or bad?
Dan
I have been struggling to figure out how to run the parser tests. From the very limited documentation in the code, it appears you are supposed to run them from a terminal. However, when I cd to the maintenance directory and type "php parserTests.php" I get the following error message.
Parse error: parse error, expecting `T_OLD_FUNCTION' or `T_FUNCTION' or `T_VAR' or `'}'' in /Users/dnessett/Sites/Mediawiki/maintenance/parserTests.inc on line 43
Either there is some setup necessary that I haven't done, parserTests.php is not the appropriate "top-level" target for the execution, you are not supposed to run these tests from the terminal, or there is something else I am doing wrong.
I tried to find documentation on how to run the tests without success. If I simply haven't looked in the right place, a quick pointer to the appropriate instructions would be great. Otherwise, I wonder if someone could instruct me how to run them.
Thanks,
Dan
--- On Fri, 7/10/09, Aryeh Gregor <Simetrical+wikilist(a)gmail.com> wrote:
> From: Aryeh Gregor <Simetrical+wikilist(a)gmail.com>
> Subject: Re: [Wikitech-l] How do you run the parserTests?
> To: "Wikimedia developers" <wikitech-l(a)lists.wikimedia.org>
> Date: Friday, July 10, 2009, 3:01 PM
> On Fri, Jul 10, 2009 at 5:46 PM, dan nessett <dnessett(a)yahoo.com> wrote:
> > MediaWiki 1.14.0
> > PHP 5.2.6 (apache2handler)
> > MySQL 5.0.41
> >
> > ...
> > private $color;
>
> I'm going to bet that php on your command line is actually PHP 4, not
> PHP 5. Try php -v to check this. Using "php5 parserTests.php" might
> work.
>
Good guess. It appears I have both php4 and php5 installed.
Thanks.
Dan