On Sun, Dec 28, 2008 at 9:42 AM, Ilmari Karonen nospam@vyznev.net wrote:
> A point I haven't seen anyone bring up yet is that, if we're messing with section anchors anyway, this would be a good time to try to make them less likely to collide with other IDs on the page.
> For example, now that we've forced all section anchors to start with an ASCII letter, we could go one step further and force them to start with an _uppercase_ ASCII letter. Since most if not all of our other ID attributes start with a lowercase letter, this would conveniently place the two in separate namespaces.
See bug 10721. IDs should not differ only in case: IDs that do apparently cause problems in IE, which treats them case-insensitively, and avoiding them is also allegedly required or recommended by XHTML (a comment in Parser.php on line 3619 says so, but I can't find it in the spec).
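To make the case issue concrete, here is a rough sketch of what case-insensitive uniqueness could look like. It is written in Python only for brevity and is not what Parser.php does; the function name dedupe_ids and the "_2", "_3" suffix scheme are assumptions made for illustration.

```python
def dedupe_ids(ids):
    """Return the IDs in order, suffixing any that would collide
    with an earlier one when compared case-insensitively.

    Hypothetical helper: the name and the "_N" suffix scheme are
    assumptions for this sketch, not existing MediaWiki code.
    """
    seen = set()      # lowercased IDs already handed out
    result = []
    for raw in ids:
        candidate = raw
        n = 1
        # Keep bumping the suffix until the lowercased form is unused.
        while candidate.lower() in seen:
            n += 1
            candidate = f"{raw}_{n}"
        seen.add(candidate.lower())
        result.append(candidate)
    return result
```

With this, "Foo" and "foo" on the same page would come out as "Foo" and "foo_2", so a browser that compares IDs case-insensitively still sees two distinct targets.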
I would take two steps to ensure that conflicts are avoided:
1) Prohibit user-provided IDs from beginning with "mw-". Since all new IDs should have been using this prefix for a long time now, this should rule out most conflicts. Also prohibit the other prefixes in use, like "ca-", "p-", "n-", "page-".
2) Start a list of the old, deprecated-style IDs that we still use, and check user-provided IDs against it with in_array(). We can start with some of the really common ones like "content" and let people create a more comprehensive list if they can be bothered, or as specific conflicts arise.
(2) is kind of ugly, but there doesn't seem to be any better way to do it.
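The two checks together might look roughly like the sketch below. It is in Python for brevity; in PHP the list lookup would be the in_array() call mentioned above. The names RESERVED_PREFIXES, LEGACY_IDS and is_reserved_id are made up for illustration, the legacy list contains only the one example named here, and comparing in lowercase is my assumption, following the IE case-insensitivity point.

```python
# Prefixes named in step 1; user-provided IDs may not start with these.
RESERVED_PREFIXES = ("mw-", "ca-", "p-", "n-", "page-")

# Step 2: old-style IDs checked one by one. Only "content" is named
# in the mail; a real list would be longer and grow as conflicts arise.
LEGACY_IDS = frozenset({"content"})


def is_reserved_id(candidate):
    """Return True if a user-provided ID would clash with interface IDs.

    Hypothetical helper. The comparison is done in lowercase (an
    assumption) because IE treats IDs case-insensitively.
    """
    lowered = candidate.lower()
    # str.startswith accepts a tuple and matches any of its entries.
    return lowered.startswith(RESERVED_PREFIXES) or lowered in LEGACY_IDS
```

A caller would then rename or reject any user-provided ID for which is_reserved_id() returns True, for example "mw-head" or "Content", while leaving something like "History" alone.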
Now, how about figuring out how to get manually specified IDs like <span id="foo"> to be unique? :)