Just a couple of comments on this
Rob Lanphier wrote:
Ok, that's good. What about the ramifications for relative URL handling in RFC 2396? http://www.ietf.org/rfc/rfc2396
I haven't found any immediate problems, but it would take me a while reading through the BNF to figure out if there are places where it breaks.
Not that it's a huge deal if relative URLs don't work, since MW can always just stick to absolute references, but it's one area where things can go wrong.
Just a comment in passing: RFC 3986 obsoletes RFC 2396
http://www.ietf.org/rfc/rfc3986.txt
The one specific prohibition of a double slash in the path applies at the start of the path, when the URI has no authority component.
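To illustrate why that leading position is special, here is a minimal sketch using Python's standard urllib.parse (my choice of tool, not something from the RFC or from MW). A reference that starts with "//" and has no scheme is read as carrying an authority, so the first "path segment" gets taken as a host name, whereas doubled slashes later in the path are kept as they are. The example URLs are only illustrations.

```python
from urllib.parse import urlsplit

# No scheme and a leading "//": the parser treats what follows as the
# authority (host), not as the first segment of the path.
leading = urlsplit("//foundation/faq.html")

# With an authority present, repeated slashes inside the path are
# preserved verbatim by the parser.
inner = urlsplit("http://apache.org///foundation////faq.html")

print(leading.netloc, leading.path)
print(inner.netloc, inner.path)
```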
but it does fight typical conventions, which is kind of a bad thing. For example, it appears that Apache throws away extra slashes, as can be seen here:
http://apache.org///foundation////faq.html
http://apache.org/foundation/faq.html
IIS seems to do the same thing:
http://www.microsoft.com////windowsserversystem///default.mspx
I assure you that Apache does not throw away extra slashes. I have already done the necessary programming to handle URLs such as those I have mentioned. The examples you gave don't say anything about the web servers themselves, because both URLs are obviously mappings to a filesystem (whether virtual or not); it is that filesystem that throws away the extra slashes. You can easily test this: both Linux and Windows let you put a doubled / or a doubled \, respectively, in a path without complaint.
Compare: http://www.livejournal.com/manage/index.bml and http://www.livejournal.com/manage//index.bml
Both URLs serve the same file because the path is a mapping to a filesystem, but the rendered pages differ because the individual strings on them are retrieved from codes that are based on the path. Those codes contain only single slashes, so the second page is missing those strings. This clearly shows that it's the filesystem and not Apache that "throws away" double slashes.
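A rough sketch of that difference, in Python: posixpath.normpath stands in for the way a POSIX filesystem treats repeated slashes, and a plain dictionary stands in for a string lookup keyed by the request path. The dictionary, its key and its text are hypothetical; this is not LiveJournal's actual code.

```python
import posixpath

# Hypothetical table of page strings, keyed by the exact request path.
page_strings = {"/manage/index.bml": "Manage your account"}

single = "/manage/index.bml"
double = "/manage//index.bml"

# Filesystem-style resolution collapses the repeated slash, so both
# request paths end up at the same file.
same_file = posixpath.normpath(single) == posixpath.normpath(double)

# A lookup on the raw path string does no such collapsing, so the
# double-slash variant finds nothing.
found_single = page_strings.get(single)
found_double = page_strings.get(double)

print(same_file, found_single, found_double)
```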
Ok, that's good.
Still, I maintain that assigning unique semantics to "//" versus "/" when used in that part of a URL doesn't have a lot of precedent, which also means that there's probably a lot of places it can break. I admit that's a vague criticism, but I just have a bad gut feeling about going down that road.
FWIW, I share your bad gut feeling. Superfluous slashes in the path may be used as an evasion technique, and some security packages (for example, mod_security for Apache) normalize the path, stripping out the extra slashes. While this particular case might not pertain to our installation at this time, it is an example of the kind of unforeseen problems that may lie down this road.
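To make the evasion concern concrete, here is a small sketch in Python. It is not how mod_security is implemented; the prefix rule, the helper names and the example path are all made up for illustration. It only shows why a filter would want to collapse slashes before matching, and therefore why a "//" that carries meaning would not survive such a filter.

```python
import re


def collapse_slashes(path: str) -> str:
    """Replace every run of two or more slashes with a single slash."""
    return re.sub(r"/{2,}", "/", path)


def is_protected(path: str) -> bool:
    """Hypothetical rule: anything under /admin/ is restricted."""
    return path.startswith("/admin/")


raw = "//admin/settings"

# The naive check on the raw path misses the request...
missed = is_protected(raw)

# ...while the same check on the normalized path catches it.
caught = is_protected(collapse_slashes(raw))

print(missed, caught)
```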