I've found an issue with our external image embedding whitelist
functionality.
This isn't exactly a hole in the code itself; rather, in practice it seems
just about everyone uses the whitelist incorrectly and ends up opening
holes in their wiki that allow the whitelist to be bypassed.
I'll start with MW.org as an example:
https://www.mediawiki.org/wiki/MediaWiki:External_image_whitelist
This image whitelist is fine: it's properly anchored with an initial ^ and
an explicit protocol, and it doesn't use excessive wildcards. There's
nothing wrong with it.
However, when I do a Google search for some of the top wikis using the
image whitelist functionality, I see this:
http://rbose.org/wiki/MediaWiki:External_image_whitelist
http://mbmodwiki.ollclan.eu/MediaWiki:External_image_whitelist
http://wiki.vnations.net/index.php/MediaWiki:External_image_whitelist
http://stelio.net/geeki/MediaWiki:External_image_whitelist
http://community.wikia.com/wiki/MediaWiki:External_image_whitelist
Basically EVERYONE except the smart people running Wikimedia sites uses
the image whitelist incorrectly. Some have rules using .*, but more
importantly NO ONE anchors their whitelist rules (in some cases they don't
even include the protocol, so we can't even add an implicit anchor to the
regexps).
This means that the whitelists can be trivially bypassed:
http://community.wikia.com/wiki/User:Dantman/Whitelist_hole
In this example Wikia has a `wikia\.com` regexp line in their image
whitelist.
By using something like this the image whitelist is bypassed:
http://imgs.xkcd.com/comics/security_holes.png?wikia.com&image.png
The "?wikia.com" inside the query triggers the whitelist match, allowing
the image to be embedded, and the trailing &image.png makes sure that the
url still matches the internal image url embed regexp.
A query like this lets you embed basically any url you want into the wiki,
since the query portion of the url is ignored by webservers serving images.
It doesn't even necessarily need to be a query: I haven't tested it, but
the fragment might be usable, and even if not, you could likely use the
path portion of the url if you had a server set up to serve images for
certain weird urls.
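To make the bypass concrete, here is a minimal Python sketch. It only
approximates the whitelist check as an unanchored, case-insensitive regexp
search; the `is_whitelisted` helper and the anchored variant are
assumptions for illustration, not MediaWiki's code or Wikia's actual rule.

```python
import re

# Whitelist line from the Wikia example above (unanchored).
UNANCHORED = r"wikia\.com"
# Hypothetical stricter rule: anchored, explicit protocol, no ".*".
ANCHORED = r"^https?://([a-z0-9-]+\.)*wikia\.com/"


def is_whitelisted(pattern, url):
    """Approximate the check as an unanchored, case-insensitive search."""
    return re.search(pattern, url, re.IGNORECASE) is not None


attack = "http://imgs.xkcd.com/comics/security_holes.png?wikia.com&image.png"
legit = "http://images.wikia.com/foo/image.png"  # made-up legitimate url

print(is_whitelisted(UNANCHORED, attack))  # True: the query alone matches
print(is_whitelisted(ANCHORED, attack))    # False: the host is checked
print(is_whitelisted(ANCHORED, legit))     # True
```

The anchored rule forces the match to start at the protocol and run through
the host name, so text placed in the query can no longer satisfy it.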
And to be clear, I don't believe that patterns like
`http://upload\.wikimedia\.org/` and `^http://(.*?\.)?wordpress\.com/`
are safe either. I believe the special characters such a pattern requires
can still appear in the later parts of a url, so the bypass still works.
And ^ anchoring doesn't help when using .* style wildcards, because you
can craft a url such as
http://my.malicious-website.com/path/to/my/evil/image.png?.wordpress.com&am…
which would match that latter regexp.
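As a rough illustration of why the wildcard defeats the ^ anchor, here is
a small Python sketch. The attack url below is a made-up stand-in (not the
truncated one above), and the stricter pattern is only one possible fix,
assuming that subdomains consist of letters, digits and hyphens.

```python
import re

# Anchored rule with a ".*" style wildcard, as quoted above.
WILDCARD = r"^http://(.*?\.)?wordpress\.com/"
# Hypothetical fix: only allow host-name characters before the domain.
STRICT = r"^http://([a-z0-9-]+\.)*wordpress\.com/"

# Made-up urls for illustration only.
attack = "http://evil.example/path/image.png?.wordpress.com/&image.png"
legit = "http://blog.wordpress.com/files/image.png"


def matches(pattern, url):
    """Case-insensitive search, standing in for the whitelist check."""
    return re.search(pattern, url, re.IGNORECASE) is not None


print(matches(WILDCARD, attack))  # True: ".*?" swallows host, path and "?"
print(matches(STRICT, attack))    # False: "/" and "?" are not host chars
print(matches(STRICT, legit))     # True
```

Because `.` matches `/` and `?` as well, `.*?` can stretch past the real
host name into the path or query; a character class limited to host-name
characters cannot.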
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]