On 8/18/06, Jay R. Ashworth jra@baylink.com wrote:
I very strongly suspect that no one who hasn't lived intimately with the parser code (that's, what, 4 or 5 people? :-) could predict what those things would do; they all seem implementation defined to me.
Or almost all...
They do illustrate why making a late pass to hotlink URLs might not be a safe approach, though.
(oops, I should have changed the subject earlier)
Depends what you mean by a "late pass". Any "early pass" is wrong - basically, a URL should only match if absolutely nothing else does - no normal links, for instance. But what kind of "late pass" - is there a parse tree that you can check to see whether the token has been matched against anything fancier than plain text?
The most interesting revelation of the above tests, for those who missed it, is that it *is* possible to link to a page named after a URL, but [[http://foo.com]] won't do it (that generates a, what was it, "direct link"). However, [[ http://foo.com]] works, although the page ends up being called "Http://foo.com". It's not completely inconceivable to me that one day we might want to write an article about a URL, like if some postmodern band names an album "http://stupid.com" or something.
Steve
On 8/18/06, Steve Bennett stevage@gmail.com wrote:
The most interesting revelation of the above tests, for those who missed it, is that it *is* possible to link to a page named after a URL, but [[http://foo.com]] won't do it (that generates a, what was it, "direct link"). However, [[ http://foo.com]] works, although the page ends up being called "Http://foo.com". It's not completely inconceivable to me that one day we might want to write an article about a URL, like if some postmodern band names an album "http://stupid.com" or something.
True (album names, ugh). Note the following in Parser.php:
    # Don't allow internal links to pages containing
    # PROTO: where PROTO is a valid URL protocol; these
    # should be external links.
    if (preg_match('/^(\b(?:' . wfUrlProtocols() . '))/', $m[1])) {
        $s .= $prefix . '[[' . $line ;
        continue;
    }
Any reason that we explicitly ban pages from having titles that look like URLs?
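For anyone curious about how that anchored check behaves, here's a rough Python sketch of the same regex logic. The protocol list here is a stand-in for whatever wfUrlProtocols() actually returns (I haven't checked), but it shows why a leading space in the link target slips past the check:

```python
import re

# Stand-in for wfUrlProtocols(); the real list comes from configuration.
url_protocols = r"http://|https://|ftp://|mailto:"

# Mirrors the anchored check in Parser.php: the title must *start*
# with a protocol for the link to be rejected as "external".
proto_re = re.compile(r"^(\b(?:%s))" % url_protocols)

def looks_like_external(title):
    return bool(proto_re.match(title))

print(looks_like_external("http://foo.com"))   # caught by the check
print(looks_like_external(" http://foo.com"))  # leading space defeats the ^ anchor
print(looks_like_external("Main Page"))
```

Which would explain the earlier observation that [[ http://foo.com]] (with a space) creates an internal link while [[http://foo.com]] doesn't.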
On 8/18/06, Simetrical Simetrical+wikitech@gmail.com wrote:
True (album names, ugh). Note the following in Parser.php:
    # Don't allow internal links to pages containing
    # PROTO: where PROTO is a valid URL protocol; these
    # should be external links.
    if (preg_match('/^(\b(?:' . wfUrlProtocols() . '))/', $m[1])) {
        $s .= $prefix . '[[' . $line ;
        continue;
    }
Any reason that we explicitly ban pages from having titles that look like URLs?
Hmm, can't vouch for that, but I just tried this at another wiki, and the result was quite strange. I created a link to [[ http://test.com]]. I clicked the redlink and saved some text. At that point, the page was apparently renamed http:/test.com (note the missing slash), and I was told that that page was empty. Returning to the original page, my redlink is now blue, and definitely points to http://test.com. However, clicking on it takes me to http:/test.com, which doesn't exist.
Oh, now here's fun. I repeated the experiment with [[ mailto:foo]]. This time, the link behaved as expected, and I was able to save text to the page "mailto:foo". I return to the original page, and click the blue link. Guess what happens? My mail editor opens...
Strangely, neither {{http://test.com}} nor {{mailto:foo}} works, but that could be a namespace issue, where it's treating http: and mailto: as the namespace, and for some reason deciding to tack Template: on the front, or something.
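If it is a namespace issue, the failure mode might look something like this. This is pure guesswork at the title-splitting logic, not the actual MediaWiki code, but it shows how a colon-splitting parser could choke on a URL-shaped title:

```python
# Hypothetical sketch: if title parsing splits on the first colon and
# treats the prefix as a namespace, "http://test.com" would be read as
# namespace "http" plus title "//test.com" -- and since "http" is not a
# recognized namespace, the whole string falls through unsplit, at which
# point the default (Template:) handling could take over.
known_namespaces = {"template", "user", "talk"}  # illustrative subset

def split_title(raw):
    if ":" in raw:
        prefix, rest = raw.split(":", 1)
        if prefix.lower() in known_namespaces:
            return prefix, rest
    return None, raw  # no recognized namespace prefix

print(split_title("http://test.com"))  # "http" is not a known namespace
print(split_title("Template:Foo"))
```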
Anyway, anyone believe me yet that these magic words are a bad idea? :)
Steve
On Fri, Aug 18, 2006 at 07:33:24PM +0200, Steve Bennett wrote:
On 8/18/06, Jay R. Ashworth jra@baylink.com wrote:
I very strongly suspect that no one who hasn't lived intimately with the parser code (that's, what, 4 or 5 people? :-) could predict what those things would do; they all seem implementation defined to me.
Or almost all...
They do illustrate why making a late pass to hotlink URLs might not be a safe approach, though.
(oops, I should have changed the subject earlier)
Depends what you mean by a "late pass". Any "early pass" is wrong - basically, a URL should only match if absolutely nothing else does - no normal links, for instance. But what kind of "late pass" - is there a parse tree that you can check to see whether the token has been matched against anything fancier than plain text?
No, my suggestion had been to do a final pass that handled that and several other things (like MAGIC words)... but on reflection, I think you *don't* want that processing applied to things which have already been parser-expanded, so I guess you have to let the parser handle them in-line as well.
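As a rough illustration of what a "late pass" could look like, assuming the parser produced a tree where already-matched tokens are marked as such, the URL hotlinker would only touch plain-text nodes. This is a sketch of the idea, not anything resembling how Parser.php actually works:

```python
import re

URL_RE = re.compile(r"https?://\S+")

# Minimal node model: ("text", s) is untouched plain text;
# ("link", s) has already been matched by something fancier
# and must be left alone by the late pass.
def hotlink_late_pass(nodes):
    out = []
    for kind, s in nodes:
        if kind == "text":
            # Only plain text is eligible for URL auto-linking.
            s = URL_RE.sub(
                lambda m: '<a href="%s">%s</a>' % (m.group(0), m.group(0)), s)
        out.append((kind, s))
    return out

nodes = [("text", "see http://example.com here"),
         ("link", "http://example.com")]  # already parsed; skipped
print(hotlink_late_pass(nodes))
```

The catch, as noted above, is that by this stage some of the text nodes may themselves be the output of parser expansion, which you probably don't want re-processed.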
The most interesting revelation of the above tests, for those who missed it, is that it *is* possible to link to a page named after a URL, but [[http://foo.com]] won't do it (that generates a, what was it, "direct link"). However, [[ http://foo.com]] works, although the page ends up being called "Http://foo.com". It's not completely inconceivable to me that one day we might want to write an article about a URL, like if some postmodern band names an album "http://stupid.com" or something.
Hee.
Cheers,
-- jr 'http://www.washme.com/soap.html'a
wikitech-l@lists.wikimedia.org