On Mon, Jan 28, 2019 at 10:58 PM Kunal Mehta <legoktm@member.fsf.org> wrote:
> Tim wrote a nice blog post about how he reverse-engineered this:
> <https://tstarling.com/blog/2008/12/secure-web-uploads/>.
>
> I don't have any comments on whether it's still needed, but if it's
> determined that MediaWiki can drop the checks, I'd like to see it
> turned into a PHP library...mostly because it's some neat code.

Heck yes. :)

My provisional patch is still keeping the code, but using it more conservatively:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/487527

The exact IE-based heuristics are still checked but will trip only on files matching the exact conditions that would affect old IE versions. Most uploaded JPEG or PNG files with HTML-ish links in EXIF metadata should not trigger it, as the tag strings won't appear in the first 256 bytes of the file.
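
To make the 256-byte point concrete, here's a rough sketch in Python. It is not the code from the patch and not the actual IE detection logic; the function name and the tag list are made up for illustration. It only shows why a tag-like string deep in a file's metadata is out of reach of a check limited to the first 256 bytes.

```python
# Hypothetical illustration only; not MediaWiki's or IE's actual logic.
SNIFF_WINDOW = 256  # bytes, the window mentioned above

# Illustrative tag list (an assumption, not the real list used by IE).
EXAMPLE_TAGS = (b"<html", b"<script", b"<body")


def tags_in_sniff_window(data, tags=EXAMPLE_TAGS, window=SNIFF_WINDOW):
    """Return the example tags found within the first `window` bytes."""
    head = data[:window].lower()  # case-insensitive match on the prefix only
    return [tag for tag in tags if tag in head]


# A JPEG-like blob with a tag-like string well past the window: not flagged.
deep = b"\xff\xd8\xff\xe1" + b"\x00" * 600 + b"<script>x</script>"
# The same string right after the leading bytes: flagged.
shallow = b"\xff\xd8\xff\xe1" + b"<SCRIPT>x</script>"

print(tags_in_sniff_window(deep))
print(tags_in_sniff_window(shallow))
```

A tag that starts inside the window but runs past byte 256 is not matched by this sketch; whether old IE treated that case the same way is not something I'm claiming here.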

The less-exact heuristics (held over from before Tim's addition of the reverse-engineered exact checks) are now less overbearing, and should no longer trigger on <a href, <img, <pre, <table, or <title tags. This also makes the $wgAllowTitlesInSVG setting obsolete; it will now always be treated as true.


Feel free to help with review and testing. Samples of JPEG, PDF, DjVu, or other files that were blocked before but should now work are especially welcome, so we can double-check. Thanks all!

-- brion