Aryeh Gregor wrote:
See bug 10721. IDs should not differ only in case: this apparently causes problems with IE, which treats them case-insensitively, and avoiding it is also allegedly required or recommended by XHTML (a comment in Parser.php on line 3619 says this, but I can't find it in the spec).
Well, that rather puts a damper on that one, then. Too bad. :(
I would take two steps to ensure that conflicts are avoided:
- Prohibit user-provided IDs from beginning with "mw-". Since all new IDs should have been using this prefix for a long time now, this should cut out a lot of the potential for conflict. Also add the other prefixes that are in use, like "ca-", "p-", "n-", "page-".
A simpler, though more heavy-handed, approach might be simply to disallow dashes in section anchors (presumably replacing them with ".2D"). This would leave any dashed IDs free for other uses.
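To make the two options concrete, here is a rough sketch in Python rather than the parser's PHP. The function names are made up for illustration and are not anything in MediaWiki; the prefix list is just the one quoted above.

```python
# Prefixes mentioned in this thread as reserved for the software's own IDs.
RESERVED_PREFIXES = ("mw-", "ca-", "p-", "n-", "page-")


def has_reserved_prefix(anchor):
    """Option 1: detect user-provided IDs that start with a reserved prefix.

    The comparison is case-insensitive, since IE treats IDs that way.
    """
    return anchor.lower().startswith(RESERVED_PREFIXES)


def escape_dashes(anchor):
    """Option 2: replace every dash in a section anchor with ".2D".

    After this, no section anchor contains a dash at all, so any dashed ID
    is free for other uses.
    """
    return anchor.replace("-", ".2D")
```

With the second option, a section titled "mw-content" would get the anchor "mw.2Dcontent", so the prefix check becomes unnecessary for section anchors.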
- Start making a list of all the old, deprecated-style IDs that we use, and manually check against all of them with in_array(). We can start with some of the really common ones like "content" and let people create a more comprehensive list if they can be bothered, or as specific conflicts arise.
We might be able to adapt the existing code in the parser that prevents duplicate section anchors from occurring: just prepopulate the list of already seen anchors with any IDs that occur in skins.
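Roughly what I have in mind, again as a Python sketch and not the actual parser code. The numeric-suffix scheme and the function name are assumptions for illustration, and the skin ID list is a placeholder that only contains the one example named above.

```python
# Placeholder list; "content" is the only ID named in this thread.
SKIN_IDS = ["content"]


def unique_anchor(anchor, seen):
    """Return an anchor that does not collide, ignoring case, with any in `seen`.

    `seen` is a set of lowercased anchors and is updated in place. On a
    collision a numeric suffix (_2, _3, ...) is appended until the result
    is free.
    """
    candidate = anchor
    n = 1
    while candidate.lower() in seen:
        n += 1
        candidate = "{}_{}".format(anchor, n)
    seen.add(candidate.lower())
    return candidate


# Prepopulate the "already seen" set with the skin's IDs before parsing
# any sections, so a section can never claim one of them.
seen = {skin_id.lower() for skin_id in SKIN_IDS}
```

A section headed "Content" would then come out as "Content_2" even though no other section uses that name, because the skin's "content" is already in the set.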
Now, how about figuring out how to get manually specified IDs like <span id="foo"> to be unique? :)
The big problem there is that there's no sensible way to disambiguate them. Probably the best we can do would be to silently drop any duplicates. Or maybe replace them with a really in-your-face "disambiguation" like OMG-DUPLICATE-ID-whatever-FIX-IT-YOU-IDIOT-1. :)
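The "silently drop" variant is simple enough to sketch. This is illustrative Python with a made-up function name, and it assumes we already have the id attributes in document order:

```python
def drop_duplicate_ids(ids):
    """Keep the first occurrence of each ID and drop later repeats.

    IDs are compared case-insensitively. A dropped ID is represented as
    None, meaning the element would be emitted without an id attribute.
    """
    seen = set()
    result = []
    for element_id in ids:
        key = element_id.lower()
        if key in seen:
            result.append(None)  # duplicate: drop the attribute
        else:
            seen.add(key)
            result.append(element_id)
    return result
```

So for the IDs "foo", "bar", "Foo" in that order, the third element would lose its id.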
Anyway, this kind of sounds like something Tidy ought to be doing. Doesn't it?