Hello, it seems that after the blackout an Apache rewrite rule that allowed "short URLs" like this:
instead of
http://en.wikipedia.org/wiki/USA
is now missing. Or maybe it depends on which particular Apache server you hit. The first URL gives a 404.
Alfio
Alfio Puglisi wrote:
it seems that after the blackout an Apache rewrite rule that allowed "short URLs" like this:
instead of
http://en.wikipedia.org/wiki/USA
is now missing. Or maybe it depends on which particular Apache server you hit. The first URL gives a 404.
There was never an Apache rewrite rule for this. Erik had hacked it into the 404 error response page.
I've taken it out for the moment as a) like adding random domain names, it's a really bad idea to encourage a dependency on inherently unstable URLs, and b) there was an infinite looping problem.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
I've taken it out for the moment as a) like adding random domain names, it's a really bad idea to encourage a dependency on inherently unstable URLs, and b) there was an infinite looping problem.
a) I don't see why these URLs should be, by necessity, inherently unstable. Unlike domain names, they don't require registration, they're purely software. They've been in use for a few months now without any issues besides the Apache problem you discovered. Where is the high maintenance requirement? In my opinion, it's bad to keep them disabled after they've been in use for several months already. This discussion, if it needs to take place, should have taken place when the short URLs were first introduced.
b) is fixed in the version on the server, to my knowledge. Am I missing something?
I understand that it's annoying to track down a weird bug like this in an obscure script, but I don't think that is sufficient reason to turn off the short URLs completely.
Peace,
Erik
Erik Moeller wrote:
Brion Vibber wrote:
I've taken it out for the moment as a) like adding random domain names, it's a really bad idea to encourage a dependency on inherently unstable URLs, and b) there was an infinite looping problem.
a) I don't see why these URLs should be, by necessity, inherently unstable. Unlike domain names, they don't require registration, they're purely software.
As we add things on the servers or to the software, portions of the URL space become unusable for redirection because there really are files there.
b) is fixed in the version on the server, to my knowledge. Am I missing something?
Well, it's fixed now that I've had a chance to hack at it. The tests for URL regions where actual files are known to exist (which would also have prevented the infinite redirect loop) were *all* wrongly written and failed on the actual strings tested. (The tests required the path not to have a '/' at the start, but it always does.)
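The failed test described above can be illustrated with a minimal Python sketch. The handler's actual code does not appear in this thread, so the prefix list and function names below are assumptions chosen purely for illustration.

```python
# Hypothetical list of URL regions where real files are known to exist.
RESERVED_PREFIXES = ("w/", "skins/", "wiki/")


def is_reserved_buggy(path: str) -> bool:
    # Bug: the prefixes have no leading "/", but a request path always
    # starts with "/", so this comparison can never succeed.
    return path.startswith(RESERVED_PREFIXES)


def is_reserved_fixed(path: str) -> bool:
    # Fix: drop the leading "/" before comparing against the prefixes.
    return path.lstrip("/").startswith(RESERVED_PREFIXES)
```

With the buggy check, a request for "/w/index.php" is never recognised as reserved, so nothing stops the handler from redirecting it again.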
I understand that it's annoying to track down a weird bug like this in an obscure script, but I don't think that is sufficient reason to turn off the short URLs completely.
For the meantime I've got it using a timed refresh plus a clickable link. Still relatively convenient for the lazy, without lulling people into a false sense of security thinking it's a legit URL they should be able to use, link, or advertise like here: http://wikimediafoundation.org/w/index.php?title=Talk:Home&diff=2048&...
-- brion vibber (brion @ pobox.com)
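A "timed refresh plus a clickable link" page of the kind described above might look roughly like the following Python sketch. The function name, the five-second delay, and the page wording are assumptions; the real handler is not shown in this thread.

```python
from html import escape
from urllib.parse import quote


def refresh_page(short_path: str, delay_seconds: int = 5) -> str:
    """Build a 404 body that visibly forwards to the long /wiki/ URL."""
    title = short_path.lstrip("/")
    target = "/wiki/" + quote(title)  # long form used elsewhere in the thread
    href = escape(target, quote=True)
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={href}">'
        "</head><body>"
        f'<p>Page not found. Did you mean <a href="{href}">{href}</a>?</p>'
        "</body></html>"
    )
```

Because the visitor sees the intermediate page, the short form still works for the lazy but is not mistaken for a URL that is safe to publish.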
Brion Vibber <brion <at> pobox.com> writes:
For the meantime I've got it using a timed refresh plus a clickable link. Still relatively convenient for the lazy, without lulling people into a false sense of security thinking it's a legit URL they should be
Can the /wiki/ part not be removed completely from the URL? The short form is used because it is easier to write, and in print, such as on the English and Dutch leaflets or those ads (http://meta.wikimedia.org/wiki/Ads), a short URL is much better.
Walter
On Wed, 2 Mar 2005 09:04:02 +0000 (UTC), Walter Vermeir walter@wikipedia.be wrote:
Can the /wiki/ part not be removed completely from the URL?
No. As Brion rightly points out, there are, and always will be, URLs for things other than wiki pages. If, for instance, http://en.wikipedia.org/skins was the URL of a wiki page called "skins", where would the files (CSS, images, etc) for displaying the skin be stored? And even more so, if http://en.wikipedia.org/w referred to the article [[w]], all the URLs of the form http://en.wikipedia.org/w/index.php?title=... would instantly break (in fact, they would point to pages called things like [[w/index.php?title=...]]).
While we could have all sorts of exceptions and rearrangements and special cases, it would be an absolute maintenance nightmare, so I agree with Brion that these short URLs should be considered "incorrect" even if they work most of the time.
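The collision described above can be made concrete with a small Python sketch of a hypothetical scheme in which everything after the host name is treated as a page title. The function name is an assumption, and this is not how MediaWiki resolves URLs.

```python
from urllib.parse import urlsplit


def title_without_prefix(url: str) -> str:
    """Hypothetical scheme: everything after the host is the page title."""
    parts = urlsplit(url)
    title = parts.path.lstrip("/")
    if parts.query:
        # The query string would become part of the title as well.
        title += "?" + parts.query
    return title
```

Under such a scheme, http://en.wikipedia.org/skins names the article "skins" and a script URL names an article called "w/index.php?title=...", which is exactly the breakage described.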
Moin,
On Wednesday 02 March 2005 16:50, Rowan Collins wrote:
On Wed, 2 Mar 2005 09:04:02 +0000 (UTC), Walter Vermeir
walter@wikipedia.be wrote:
Can the /wiki/ part not be removed completely from the URL?
No. As Brion rightly points out, there are, and always will be, URLs for things other than wiki pages. If, for instance, http://en.wikipedia.org/skins was the URL of a wiki page called "skins", where would the files (CSS, images, etc) for displaying the skin be stored?
Er, maybe in http://en.wikipedia.org/skins/ ?
And even more so, if http://en.wikipedia.org/w referred to the article [[w]], all the URLs of the form http://en.wikipedia.org/w/index.php?title=... would instantly break (in fact, they would point to pages called things like [[w/index.php?title=...]]).
No, simple fix: don't redirect URLs with "/" in them. That would not allow a redirect for the article "skins/scans", but articles with "/" in them are probably in the minority.
In fact, without the check for "/", the redirect would also redirect "/wiki/article" to "/wiki/wiki/article", so it must be already in place..
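The rule proposed here, "don't redirect URLs with '/' in them", could be sketched in Python as follows. The function name is hypothetical and the sketch covers only this one rule, not the handler actually running on the servers.

```python
def should_redirect(path: str) -> bool:
    """Proposed rule: redirect only paths made of a single segment."""
    name = path.lstrip("/")
    # An empty path or one containing a further "/" is left alone.
    return bool(name) and "/" not in name
```

Note that single-segment paths such as "/skins" or "/robots.txt" still pass this check, so the rule alone does not separate articles from real files with the same name.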
While we could have all sorts of exceptions and rearrangements and special cases, it would be an absolute maintenance nightmare, so I agree with Brion that these short URLs should be considered "incorrect" even if they work most of the time.
One could also say that only wiki articles should live under the en.wikipedia.org namespace and everything else should be somewhere else, like files.wikipedia.org or skins.wikipedia.org etc.
Just my 0.02€,
Tels
- -- Signed on Wed Mar 2 17:59:14 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Not King yet."
On Wed, 2 Mar 2005 18:01:58 +0100, Tels nospam-abuse@bloodgate.com wrote:
And even more so, if http://en.wikipedia.org/w referred to the article [[w]], all the URLs of the form http://en.wikipedia.org/w/index.php?title=... would instantly break (in fact, they would point to pages called things like [[w/index.php?title=...]]).
No, simple fix: don't redirect URLs with "/" in them. That would not allow a redirect for the article "skins/scans", but articles with "/" in them are probably in the minority.
Well, as I said "While we could have all sorts of exceptions and rearrangements and special cases, it would be an absolute maintenance nightmare".
But to be more specific, the fact that such names are in the *minority* doesn't make any difference - if short URLs of this form were to be considered "correct" and even "normal", as people are suggesting, they would have to *always* work. Otherwise, people would just be extremely confused when they typed one in which *didn't* work - if they were never redirected to the longer form (or, arguably, if they didn't notice it, as with the former setup) they would have no idea what was wrong with the address they'd typed.
In fact, without the check for "/", the redirect would also redirect "/wiki/article" to "/wiki/wiki/article", so it must be already in place..
No, because this isn't actually a rewrite rule (as Alfio quite reasonably guessed) but a 404 handler, for when people enter a URL that doesn't exist. So no exceptions of this kind are needed, because those URLs *do* exist (in as much as they are handled by the wiki scripts).
One could also say that only wiki articles should live under the en.wikipedia.org namespace and everything else should be somewhere else, like files.wikipedia.org or skins.wikipedia.org etc.
Yes, that would certainly be a possibility - but note that each sub-domain has its own installation of the MediaWiki software, so unless we had sub-sub-domains, like skins.en.wikipedia.org (which would make administering DNS that much harder), this would probably require some pretty major changes to the code to use some "common" repository - including some way of handling the exceptions where things *need* to be different, etc.
What's more, this still wouldn't deal with the issue of URLs like .../w/index.php, which are inherently both project-dependent and part of the core code. What's more, these aren't just used within the software, but are extensively linked to externally, so any rearrangement would have to leave them working as expected. Like I say, any exception you make is going to be very confusing as soon as somebody tries it expecting the opposite behaviour.
Rowan Collins wrote:
One could also say that only wiki articles should live under the en.wikipedia.org namespace and everything else should be somewhere else, like files.wikipedia.org or skins.wikipedia.org etc.
Yes, that would certainly be a possibility - but note that each sub-domain has its own installation of the MediaWiki software, so unless we had sub-sub-domains, like skins.en.wikipedia.org (which would make administering DNS that much harder), this would probably require some pretty major changes to the code to use some "common" repository - including some way of handling the exceptions where things *need* to be different, etc.
This part's actually not really true. We have a single installation of MediaWiki, and about a half-dozen copies of the base docroot which consist of a few identical symlinks and a couple different ones.
Hypothetically we could move some of the skin files etc to a subdomain... but we could never eliminate a few things like robots.txt or the script itself, and we need to maintain the previous standard URLs as valid entry points. Exceptions aren't tenable for all-inclusive projects like ours, particularly not for the canonical URLs.
My preference has been to merge the language domains to get URLs like this:
http://wikipedia.org/en/Foobar http://wiktionary.org/la/imperium
and perhaps things like this: http://wikipedia.org/edit/fr/Nice http://wikipedia.org/history/fr/Nice?from=200411112117 http://wikipedia.org/revision/fr/1063501
But we haven't got round to that yet. I've added some preliminary support for 'action URLs' to the 1.5 code to allow prettifying the non-view actions (edit, history, etc).
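To show how URLs of the proposed shape could be taken apart, here is a minimal Python sketch. It is based only on the example URLs above; the function name and the returned dictionary are assumptions, and it does not represent the 'action URL' support in the 1.5 code.

```python
from urllib.parse import urlsplit

# Action names taken from the example URLs above; anything else is a view.
ACTIONS = {"edit", "history", "revision"}


def parse_merged_url(url: str) -> dict:
    """Split a merged-domain URL into action, language, and target."""
    parts = urlsplit(url)
    segments = parts.path.lstrip("/").split("/", 2)
    if segments[0] in ACTIONS:
        if len(segments) < 3:
            raise ValueError("action URL needs a language and a target")
        action, lang, target = segments
    else:
        if len(segments) < 2:
            raise ValueError("view URL needs a language and a title")
        action = "view"
        lang, target = segments[0], "/".join(segments[1:])
    return {"action": action, "lang": lang, "target": target, "query": parts.query}
```

In this sketch the first path segment is always either an action or a language code, so page titles never compete with anything else for the top level of the URL space.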
But we cannot and will not have canonical or standard URLs like this, ever:
http://en.wiktionary.org/robots.txt <- is this a dictionary entry or the robots exclusion file??
-- brion vibber (brion @ pobox.com)
Moin,
On Thursday 03 March 2005 01:30, Brion Vibber wrote:
But we cannot and will not have canonical or standard URLs like this, ever:
http://en.wiktionary.org/robots.txt <- is this a dictionary entry or the robots exclusion file??
If the file exists => file, otherwise entry. However, I can see what happens if the file does not exist but the entry does: the crawler would get the article instead of a 404. So that settles it then. :o)
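The "file first, otherwise entry" rule and the problem it runs into can be sketched in a few lines of Python. The function and its set arguments are hypothetical stand-ins for the web server's files and the wiki's articles.

```python
def resolve(path: str, files: set, articles: set) -> str:
    """Serve a real file if one exists, otherwise fall back to an article."""
    name = path.lstrip("/")
    if name in files:
        return "file"
    if name in articles:
        return "article"
    return "404"
```

If no robots.txt file exists but a dictionary entry with that title does, a crawler asking for /robots.txt receives the article rather than a 404, which is the case conceded above.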
Best wishes,
Tels
- -- Signed on Thu Mar 3 18:15:53 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
"Den wahren Wert dieser Software werden vermutlich nur Fach Läute und Firmen erkennen." -- "So isst es. Ein gewißer Standart muss schon gewart beiben!" -- Kabe (http://tinyurl.com/3kucx)
On Thu, 3 Mar 2005 18:17:28 +0100, Tels nospam-abuse@bloodgate.com wrote:
On Thursday 03 March 2005 01:30, Brion Vibber wrote:
But we cannot and will not have canonical or standard URLs like this, ever:
http://en.wiktionary.org/robots.txt <- is this a dictionary entry or the robots exclusion file??
if the file exists => file, otherwise entry.
Just to clarify why this doesn't solve the problem - if you have an exception to the short way, you have to teach people the long way. Hence, the short URL form is not standard and canonical, because it won't always work. Hence, you have to mainly use the long, canonical, form. And hence, you have a visible redirect like Brion has now implemented.
Rowan Collins wrote:
No. As Brion rightly points out, there are, and always will be, URLs for things other than wiki pages. If, for instance, http://en.wikipedia.org/skins was the URL of a wiki page called "skins", where would the files (CSS, images, etc) for displaying the skin be stored? And even more so, if http://en.wikipedia.org/w referred to the article [[w]], all the URLs of the form http://en.wikipedia.org/w/index.php?title=... would instantly break (in fact, they would point to pages called things like [[w/index.php?title=...]]).
In the long-term, why should there be such things? If something is not a wiki page, it shouldn't be using part of the public URL-space on en.wikipedia.org. Such things could be hosted at e.g. skins.wikipedia.org, or skins.wikipedia.org/en/ if they must be localized. There already is such a place for images (Wikimedia Commons).
-Mark
On Wed, 02 Mar 2005 13:55:51 -0500, Delirium delirium@hackish.org wrote:
In the long-term, why should there be such things? If something is not a wiki page, it shouldn't be using part of the public URL-space on en.wikipedia.org. Such things could be hosted at e.g. skins.wikipedia.org, or skins.wikipedia.org/en/ if they must be localized. There already is such a place for images (Wikimedia Commons).
Let me paste a URL from another tab of my browser: http://en.wikipedia.org/w/index.php?title=Talk:Googlepedia&action=edit Is that a wiki page? Yes. Is it of the form <server>/wiki/<pagename>? No. Is there any logic to farming it off to some other domain? No. It is, always has been, and always should be "using part of the public URL-space on en.wikipedia.org".
Or consider an address like http://en.wikipedia.org/w/index.php?title=Googlepedia&diff=10548376&... - the kind of address that people frequently bookmark, refer to in discussions, e-mails, external websites, etc. Hence, the kind of address that it would be extremely foolish to break by rearranging the URL-space.
Sorry if this comes across as rather rude, I'm just fed up of repeating the same argument again and again because people haven't understood it. Let me summarise:
* there are things in the URL-space of, e.g., en.wikipedia.org that are not wiki pages
* some of these, such as elements of the skins, are or should be only referenced internally, and would therefore be safe to move
* some, however, such as the scripts in the /w/ directory are extensively referenced in all sorts of external contexts, and therefore *must* be retained with their current function
* there is no guarantee what URLs the software will need in the future, and whether they will need to be externally referencable
* it would be possible to create a redirection system with exceptions for all the things which aren't wiki pages; wiki pages conflicting with those exceptions would then be inaccessible via that redirection system
* in order for people to then access such pages, there would need to be a longer URL format that was not prone to these conflicts
* people would need to know what those longer URLs were
* the software would need to generate those longer URLs, because unlike a human it couldn't check and then go "oh, maybe I need the longer form"
* therefore, the longer URLs would have to be considered, as they always have been, the "correct" and "normal" form
* the 404 handler in its current state does exactly this, by redirecting people, but informing them that they are being redirected.
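The reasoning in this summary can be sketched in Python under stated assumptions: the set of reserved names is hypothetical, and the helper names are invented for illustration. The point is only that a short form fails for some titles while the long form can be generated for every title.

```python
from urllib.parse import quote

# Hypothetical top-level names taken by things that are not wiki pages.
RESERVED = {"w", "skins", "robots.txt"}


def short_form_works(title: str) -> bool:
    """A short URL is usable only if its first segment is not reserved."""
    first_segment = title.split("/", 1)[0]
    return first_segment not in RESERVED


def canonical_url(host: str, title: str) -> str:
    """The long form never collides, so software can always generate it."""
    return f"http://{host}/wiki/{quote(title.replace(' ', '_'))}"
```

Since software cannot try the short form and then fall back, it would have to emit the output of canonical_url() every time, which makes the long form the normal one.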
Like I say, I do apologise for getting frustrated about this; I do realise that if people misunderstand me, it's as likely my fault as theirs.
One final thought is that for particular things like the Foundation website, invisible redirects could be created *on a case-by-case basis* so that URLs like http://wikimediafoundation.org/fundraising could be given out as "official" URLs (or for cases where they already have been). The Foundation site is an exception, in the sense that to most users it is a static site on which they can look up information; only for a few people is it usable as a wiki (i.e. editable). It therefore seems unnatural to have key URLs containing the word "wiki".
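A case-by-case redirect table of this kind could be as simple as the following Python sketch. The table contents are assumptions: the short path comes from the example above, but the target it maps to is a guess made for illustration.

```python
# Hand-maintained table of "official" short paths (target is assumed).
OFFICIAL_REDIRECTS = {
    "/fundraising": "/wiki/Fundraising",
}


def lookup_redirect(path: str):
    """Return the long URL for an approved short path, or None."""
    # Case-insensitive match, ignoring a trailing slash (a design choice
    # of this sketch, not something stated in the thread).
    return OFFICIAL_REDIRECTS.get(path.rstrip("/").lower())
```

Anything not listed in the table falls through to the normal 404 handling, so the set of exceptions stays small and explicit.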
Moin,
On Wednesday 02 March 2005 20:37, Rowan Collins wrote:
On Wed, 02 Mar 2005 13:55:51 -0500, Delirium delirium@hackish.org
wrote:
In the long-term, why should there be such things? If something is not a wiki page, it shouldn't be using part of the public URL-space on en.wikipedia.org. Such things could be hosted at e.g. skins.wikipedia.org, or skins.wikipedia.org/en/ if they must be localized. There already is such a place for images (Wikimedia Commons).
Let me paste a URL from another tab of my browser: http://en.wikipedia.org/w/index.php?title=Talk:Googlepedia&action=edit
My apologies, I didn't know about /w/.
I still think it would have made sense not to put anything except wiki articles under XY.wikipedia.org; however, it is too late to change this now.
Also, thank you for explaining these matters. I know it must be frustrating to have to explain things over and over again. Maybe a FAQ for these things could be created.
Again, thanks for your understanding and patience :)
Best wishes,
Tels
- -- Signed on Wed Mar 2 20:51:30 2005 with key 0x93B84C15. Visit my photo gallery at http://bloodgate.com/photos/ PGP key on http://bloodgate.com/tels.asc or per email.
Brion Vibber wrote:
I've taken it out for the moment as a) like adding random domain names, it's a really bad idea to encourage a dependency on inherently unstable URLs, and b) there was an infinite looping problem.
-- brion vibber (brion @ pobox.com)
The printed and distributed English and Dutch leaflets about Wikipedia use the URLs "http://wikimediafoundation.org/Fundraising" and "http://wikimediafoundation.org/giften".
It would be useful if they worked again.
I'd like it back too, not having to type an extra five characters was nice.
wikitech-l@lists.wikimedia.org