On 11/04/12 09:27, Kim Eik wrote:
I have created a patch for the gallery tag and have been given the
following review.
https://gerrit.wikimedia.org/r/4609
* JavaScript injection: you can inject javascript: URIs which execute
code when clicked
* plain links ("link=Firefox") are taken as relative URLs which will
randomly work or not work depending on where they're viewed from
<snip>
What would be the recommended way of stripping javascript from URIs?
Are there any shared functions which do exactly this?
And how would I solve the plain links problem? Do a regex check for an
absolute URI, e.g. http://example.org/foo/bar?
I have added an inline comment on includes/parser/Parser.php, patch #7:
https://gerrit.wikimedia.org/r/#patch,unified,4609,7,includes/parser/Parser…
Copy-pasting it here for later reference:
----------------------------------------------------------------------
const EXT_URL_REGEX =
'/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+
)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?
([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
We would need a parser guru to come up with a similar but simpler regex.
Anyway, you will find hints in includes/parser/Parser.php:
wfUrlProtocols() gives a regex of the protocols allowed in URLs.
Parser::EXT_LINK_URL_CLASS is a regex class describing which characters
are allowed in a URL and which are disallowed. That makes sure you find
the end of the URL in various funny cases, such as U+3000, which is an
ideographic space and is used on Chinese wikis.
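
To make the two hints above concrete, here is a minimal standalone sketch of a protocol whitelist combined with a character class that stops at spaces, including U+3000. The names sketchUrlProtocols(), SKETCH_URL_CLASS and sketchLeadingUrl() are hypothetical, and the protocol list and character class are simplified stand-ins, not the actual MediaWiki definitions.

```php
<?php
// Hypothetical stand-in for wfUrlProtocols(): build a regex alternation
// from a list of allowed protocols.
function sketchUrlProtocols( array $protocols ) {
	$quoted = array();
	foreach ( $protocols as $protocol ) {
		// Escape regex metacharacters, including the "/" delimiter.
		$quoted[] = preg_quote( $protocol, '/' );
	}
	return implode( '|', $quoted );
}

// Simplified stand-in for Parser::EXT_LINK_URL_CLASS (an assumption, not
// a copy): any character except brackets, angle brackets, double quotes,
// control characters and Unicode space separators (\p{Zs} covers U+3000).
const SKETCH_URL_CLASS = '[^][<>"\\x00-\\x20\\x7F\\p{Zs}]';

// Return the URL at the start of $text, or null if $text does not start
// with a whitelisted protocol.
function sketchLeadingUrl( $text ) {
	$prots = sketchUrlProtocols( array( 'http://', 'https://', 'ftp://' ) );
	$chars = SKETCH_URL_CLASS;
	// The /u modifier is required for \p{Zs} to match UTF-8 input.
	if ( preg_match( "/^(?:$prots)$chars+/u", $text, $m ) ) {
		return $m[0];
	}
	return null;
}
```

With this sketch, a URL followed by an ideographic space is cut at the space, while "javascript:alert(1)" and a plain "Firefox" are not recognised as URLs at all.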
What you are trying to achieve is really similar to the 'link'
parameter handling in Parser::makeImage(). Some relevant code:
    case 'link':
        $chars = self::EXT_LINK_URL_CLASS;
        $prots = $this->mUrlProtocols; // which is wfUrlProtocols()
        if ( preg_match( "/^($prots)$chars+$/u", $value, $m ) ) {
            $paramName = 'link-url';
            $this->mOutput->addExternalLink( $value );
            if ( $this->mOptions->getExternalLinkTarget() ) {
                $params[$type]['link-target'] =
                    $this->mOptions->getExternalLinkTarget();
            }
Well you get the idea :-)
----------------------------------------------------------------------
Reading my text again, I should have proofread it before saving that
comment. Anyway, I am pretty sure we can factor out the code handling
'link' for images and reuse it for what you are trying to do.
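
As a rough illustration of what such a factored-out helper might look like, here is a standalone sketch. The function name sketchClassifyLinkValue() and its return values are hypothetical and do not exist in MediaWiki. It assumes that a value not starting with a whitelisted protocol should be treated as a page title rather than as a relative URL; the actual title validation is left out.

```php
<?php
// Hypothetical shared helper: decide how a link= value should be handled.
// Returns 'no-link', 'link-url', 'link-title' or 'invalid'.
function sketchClassifyLinkValue( $value, array $allowedProtocols ) {
	$quoted = array();
	foreach ( $allowedProtocols as $protocol ) {
		$quoted[] = preg_quote( $protocol, '/' );
	}
	$prots = implode( '|', $quoted );
	// Simplified character class (an assumption): no brackets, angle
	// brackets, double quotes, control characters or Unicode spaces.
	$chars = '[^][<>"\\x00-\\x20\\x7F\\p{Zs}]';

	if ( $value === '' ) {
		return 'no-link';
	}
	if ( preg_match( "/^(?:$prots)/", $value ) ) {
		// Starts with an allowed protocol: the whole value must be a
		// well-formed URL, otherwise it is rejected.
		return preg_match( "/^(?:$prots)$chars+$/u", $value )
			? 'link-url'
			: 'invalid';
	}
	// Everything else, including "javascript:..." and plain words, is
	// treated as a page title and never emitted as a raw href.
	return 'link-title';
}
```

Under these assumptions, "http://example.org/foo/bar" is classified as 'link-url', whereas "javascript:alert(1)" and "Firefox" both end up as 'link-title', so neither is ever output as a URI. Both the gallery tag and the image handling could call one helper of this kind.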
--
Antoine "hashar" Musso