There have been some comments on old tasks such as T27707
<https://phabricator.wikimedia.org/T27707> about problems with uploading
files that include text metadata that looks like HTML elements.
Years ago, we added security checks for IE 5/6/7 to work around IE's MIME
type sniffing: if you went to view a .png file directly in IE (as opposed
to in an <img>) the browser would check the first few bytes of the file to
detect its type, overriding the HTTP Content-Type header. HTML would be
detected with a higher priority than the actual image formats, making it
possible to create an actual .png image which when viewed as an image
looked like an image, but when viewed as a web page was interpreted as HTML.
(This was defense in depth in addition to hosting files on a separate
domain; among other things, we sometimes serve files out from the main
domain when dealing with archived (deleted) versions, and third-party
installs are not guaranteed to have a second domain.)
Browsers have moved on, but the code remains and it trips up legitimate
files containing links in metadata, or sometimes just random compressed
data that looks like an element!
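
To illustrate why such a check produces false positives, here is a minimal sketch in Python. It is not the actual MediaWiki upload check: the tag list, the 1024-byte window, and the function name `looks_like_html` are all assumptions made for the example. It only shows the general idea of scanning the leading bytes of a file for something that resembles an HTML element.

```python
import re

# Hypothetical tag list; the real upload check may look for a different set.
SUSPICIOUS_TAGS = (b"html", b"head", b"body", b"script", b"title", b"a", b"img")

# Assumption: only the leading bytes are inspected, as a sniffing browser would.
SNIFF_WINDOW = 1024

# Matches "<" followed by one of the tag names as a whole word, case-insensitive.
_TAG_RE = re.compile(
    rb"<\s*(" + b"|".join(SUSPICIOUS_TAGS) + rb")\b", re.IGNORECASE
)


def looks_like_html(data: bytes, window: int = SNIFF_WINDOW) -> bool:
    """Return True if the first `window` bytes contain an HTML-like tag."""
    return _TAG_RE.search(data[:window]) is not None


# Simplified stand-in for a PNG whose text metadata contains a link
# (not a structurally valid PNG chunk, just the bytes that matter here).
png_with_link = (
    b"\x89PNG\r\n\x1a\n"
    + b'tEXtComment\x00<a href="https://example.org/">source</a>'
)

# Stand-in for compressed data that happens to contain "<A" by chance.
random_looking = b"\x89PNG\r\n\x1a\n\x12<A\x00\x9c\x01"

print(looks_like_html(png_with_link))    # True: legitimate metadata is flagged
print(looks_like_html(random_looking))   # True: accidental byte pattern is flagged
print(looks_like_html(b"\x89PNG\r\n\x1a\n" + bytes(64)))  # False
```

Both of the first two inputs are harmless as images, yet a byte-pattern check of this kind cannot tell them apart from a deliberately crafted file.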
I've done a quick feasibility check:
* IE 6 and earlier can no longer access Wikimedia sites because they lack
support for SNI and for TLS 1.0 or later
* IE 7 on Windows XP can no longer access Wikimedia sites due to lack of SNI
* IE 7 on Windows Vista **can** access Wikimedia sites.
* IE 8 and higher support X-Content-Type-Options: nosniff to disable sniffing,
which we already use on all MediaWiki requests.
At some point Microsoft dropped the sniffing, but I'm not sure if it was a
later IE version or an Edge version. No other browsers in reasonably
current versions seem to have this problem.
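
For reference, the standard name of the header that disables sniffing is `X-Content-Type-Options`. A minimal sketch of what a file-serving response would carry, in Python for illustration only (the helper name `file_response_headers` is hypothetical and not part of MediaWiki):

```python
from typing import Dict


def file_response_headers(content_type: str) -> Dict[str, str]:
    """Build headers for a served file, asking the browser not to sniff.

    Hypothetical helper: it pairs the declared Content-Type with the
    nosniff directive so that IE 8+ trusts the declared type.
    """
    return {
        "Content-Type": content_type,
        "X-Content-Type-Options": "nosniff",
    }


headers = file_response_headers("image/png")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Browsers that do not understand the header (IE 7 and earlier) simply ignore it, which is why it does not help the remaining IE 7 on Vista case.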
So the only remaining browser version that might be affected is IE 7 on
Windows Vista, which supports SNI and TLS 1.0. It might or might not still
work once we drop TLS 1.0 some time in the future. (Per our TLS dashboard,
1.2% of our connections still use TLS 1.0, but this isn't broken down
between logged-in-user views and anon views.)
* Should we drop the anti-sniff checks on upload?
* If we do, should we forbid logins with IE 7, or something else to protect
the occasional IE 7 logged-in user from a hypothetical targeted drive-by
attack? (Is it actually worth doing work and testing it for this?)
* Should we add X-Content-Type-Options: nosniff on files served from