On Thu, 2002-12-26 at 00:33, Jonathan Walther wrote:
On Wed, Dec 25, 2002 at 11:32:50PM -0800, Brion Vibber wrote:
The backend implementation is not relevant. By using mod_rewrite, the URL can be in any format we like with the PHP system. But that's no excuse for making URLs long, confusing, and fragile.
True. What I meant was that by bypassing mod_rewrite, we might no longer need a special patch for Apache?
Sure.
http://foo/w/Whatlinkshere#ns=Special&target=A%26W+Root+Beer&limit=5...
Standard URL syntax provides for a query string (starting with "?"); we shouldn't be afraid to use it.
Ok. Let's say I have an article "Who Killed JFK?" and I want to view the article's history. I want to type in the URL, and I can't remember the hex code for '?'. What do I do? Just require people to use the hex code anyway for cases where the ? is part of the article title?
See RFC 2396, which defines the structure of URIs. http://www.ietf.org/rfc/rfc2396.txt
That would break up as:
(scheme)(authority)(path)(query)
(http)(foo)(/w/Who_Killed_JFK)(?action=history&limit=10)
(Technically the question mark is "reserved" in the contents of a query string, and I'm not sure it's allowed except as the separator from the path.)
Question mark is simply *not* a valid URL _path_ character. There's nothing we can do about that that doesn't flout standards and break things, any more than we can decide we want a domain name with a slash or a colon in it. ;)
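To make the breakdown concrete, here is a minimal sketch using Python's standard urllib.parse, which follows the same generic URI structure. Python is only a stand-in for illustration (the wiki itself is PHP behind Apache), and "foo" is the placeholder host used in this thread.

```python
from urllib.parse import urlsplit

# "foo" is the placeholder host from this thread, not a real server.
parts = urlsplit("http://foo/w/Who_Killed_JFK?action=history&limit=10")
print(parts.scheme, parts.netloc, parts.path, parts.query)

# A literal "?" typed as part of the title is taken as the path/query
# separator: everything after the first "?" lands in the query component.
literal = urlsplit("http://foo/w/Who_Killed_JFK??action=history")
print(literal.path, literal.query)
```

In the second case the title loses its question mark and the query picks up a stray one, which is the breakage described above.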
Also, I'm not clear; what "escaping" does mod_rewrite do? How does it determine whether to escape the ? and & or not? When does it do the escaping?
Before Apache hands the path to mod_rewrite, it has already decoded any URL-encoded characters (%26 etc.). So a path of "/wiki/A%26W_soda" reaches our mod_rewrite rules as "/wiki/A&W_soda". If we put that raw into a query string, "title=A&W_soda", it's no good: it breaks into "title=A" and "W_soda" when PHP interprets the query. That's why we add explicit escaping for that character.
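The same failure can be reproduced outside Apache. The sketch below is only an analogy in Python, assuming that unquote() stands in for Apache's decoding and parse_qs() for PHP's query parsing; the variable names are made up for illustration.

```python
from urllib.parse import parse_qs, quote, unquote

raw_path = "/wiki/A%26W_soda"
decoded = unquote(raw_path)           # what the rewrite rules would see
title = decoded[len("/wiki/"):]       # "A&W_soda"

# Dropping the decoded title straight into a query string splits it at "&".
naive_query = "title=" + title
naive = parse_qs(naive_query, keep_blank_values=True)

# Re-escaping the title first keeps it in one piece.
safe_query = "title=" + quote(title, safe="")
safe = parse_qs(safe_query)

print(naive)
print(safe)
```

The naive form yields a "title" of "A" plus a stray "W_soda" key, while the re-escaped form round-trips to "A&W_soda".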
So, if I put in the following URL:
http://foo/w/Who_Killed_JFK%3F?action=history
Will mod_rewrite change the second ? to a %3F if I rewrite the URL somehow? Will it transform the %3F to a ? if I rewrite the URL somehow?
The second ? is interpreted as a separator between path and query before rewriting comes up.
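As a rough illustration of that ordering, again with Python's urllib.parse as a stand-in rather than Apache's actual code path: the split on the literal "?" happens first, and the %3F is only turned back into a question mark when the path is decoded afterwards.

```python
from urllib.parse import urlsplit, unquote

parts = urlsplit("http://foo/w/Who_Killed_JFK%3F?action=history")

encoded_path = parts.path             # still carries %3F
decoded_path = unquote(encoded_path)  # the title with its "?" restored

print(encoded_path)
print(decoded_path)
print(parts.query)
```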
By default, if you create a new query string via a rewrite rule, it completely replaces any existing query string. There's a [QSA] flag to append instead; I haven't yet checked whether it adds an ampersand separator or whether that needs further tweaking.
I'd prefer: http://foo/en/Special:Whatlinkshere?target=A%26W+Root+Beer&limit=50
That's fair enough; do you have a page already written up giving your reasons for wanting languages to be part of a hierarchy, but not namespaces?
Our namespaces aren't really hierarchical: Talk:, User:, User_talk: etc... We have a flat namespace of namespaces. ;)
Or better yet, we could take advantage of the path syntax, as long as special pages are never named with slashes:
http://foo/en/Special:Whatlinkshere/A&W_Root_Beer?limit=50
http://foo/en/Special:Contributions/Billybob
I'm having difficulty understanding; could you show me what manglement would happen under other schemes, that doesn't happen under this one?
Somebody types: http://foo/en/Special:Whatlinkshere?target=A&W_Root_Beer
they get:
(http)(foo)(/en/Special:Whatlinkshere)((target=A)(W_Root_Beer))
Thus they see links to the page [[A]]. Question marks remain problematic in the other case, of course. You can't win. ;)
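A small sketch of the two forms side by side, using Python's urllib.parse purely as an illustration (the real parsing would be done by Apache and PHP, and the way the target is pulled off the path here is an assumption, not the wiki's code):

```python
from urllib.parse import urlsplit, parse_qs

# Query-string form, typed without escaping the "&".
typed = urlsplit("http://foo/en/Special:Whatlinkshere?target=A&W_Root_Beer")
params = parse_qs(typed.query, keep_blank_values=True)

# Path form: the "&" stays inside the path component untouched.
path_form = urlsplit("http://foo/en/Special:Whatlinkshere/A&W_Root_Beer?limit=50")
target = path_form.path.rsplit("/", 1)[1]   # hypothetical way to extract the target
options = parse_qs(path_form.query)

print(params)
print(target, options)
```

In the query form the target collapses to "A"; in the path form the full "A&W_Root_Beer" survives and only "limit" is read from the query.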
-- brion vibber (brion @ pobox.com)