Hello all.
The fact that SVG, MIDI and other formats are blocked is getting really annoying. People complain about it over and over, and it is also a bad situation given that the GFDL calls for the "transparent source" of a document.
As I understand it, those formats are blocked because MSIE interprets anything that *looks* like HTML as HTML. It was then stated that, in order to circumvent this, a verifier would have to be written for every format. I do not understand why this is so, and I would like to suggest a simple solution:
* when a file is uploaded, run "file -bi" against that file and remember the output, which is (a pretty good guess of) the mime-type of the file.
* if the mime type is "text/html", refuse the upload.
* if the mime type is a forbidden format (exe, etc), refuse the upload.
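
To make the steps above concrete, here is a rough sketch in Python (not MediaWiki code, just an illustration). The function names and the contents of FORBIDDEN_MIME_TYPES are my own assumptions; the exact strings that "file" reports for executables differ between versions, so the real list would have to be checked against the installed version.

```python
import subprocess

# Assumed, illustrative blocklist. The strings "file" reports for
# executables vary between versions, so treat these as placeholders.
FORBIDDEN_MIME_TYPES = {
    "text/html",
    "application/x-dosexec",
    "application/x-executable",
}


def parse_mime_type(file_output: str) -> str:
    """Reduce the output of `file -bi` to the bare mime-type.

    The output may carry a suffix such as "; charset=us-ascii",
    which is dropped here.
    """
    return file_output.split(";", 1)[0].strip().lower()


def detect_mime_type(path: str) -> str:
    """Run `file -bi` on the uploaded file and return its guessed mime-type."""
    result = subprocess.run(
        ["file", "-bi", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_mime_type(result.stdout)


def is_upload_allowed(mime_type: str) -> bool:
    """Refuse HTML and any other forbidden type; accept everything else."""
    return mime_type not in FORBIDDEN_MIME_TYPES
```

An upload handler would call detect_mime_type() on the temporary file and refuse the upload whenever is_upload_allowed() returns False.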
That should be enough. If you want to be picky about the file's type, also do the following:
* have a map of mime-types-to-file-extensions. Look up the mime-type returned by file in that table. If it does not match the file's extension, warn about it and refuse the upload. Skip the test if the mime-type is not in the table.
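
The extension check could look roughly like this, again as a Python sketch. The table entries and the function name are assumptions for illustration only; in particular, the mime-type that "file" actually reports for SVG or MIDI depends on its version and magic database.

```python
import os

# Assumed example table: mime-type -> extensions accepted for it.
# A real table would be built from what the installed "file" reports.
MIME_TO_EXTENSIONS = {
    "image/svg+xml": {"svg"},
    "audio/midi": {"mid", "midi"},
    "image/png": {"png"},
    "image/jpeg": {"jpg", "jpeg"},
}


def extension_matches(filename: str, mime_type: str) -> bool:
    """Return False only when the mime-type is in the table and the
    file's extension is not one of the extensions listed for it."""
    allowed = MIME_TO_EXTENSIONS.get(mime_type)
    if allowed is None:
        # Unknown mime-type: skip the test, as suggested above.
        return True
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return extension in allowed
```

The comparison is done on the lowercased extension, so "tune.MID" and "tune.mid" are treated the same.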
If we are concerned about viruses in general, why not run a virus scanner against every uploaded file? Uploads are not that frequent, so the CPU should be able to cope with it.
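
As an illustration only, here is how a scanner could be hooked in, sketched in Python. I am assuming ClamAV's command-line scanner "clamscan" is installed (no particular scanner has been decided on); it exits with 0 when no virus is found, 1 when one is found, and 2 when an error occurred. The function names are made up.

```python
import subprocess


def interpret_clamscan_exit(code: int) -> str:
    """Map clamscan's exit code to a verdict.

    0 = no virus found, 1 = virus found, anything else = error.
    """
    if code == 0:
        return "clean"
    if code == 1:
        return "infected"
    return "error"


def scan_upload(path: str) -> str:
    """Scan one uploaded file with clamscan (assumed to be on the PATH)."""
    result = subprocess.run(
        ["clamscan", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return interpret_clamscan_exit(result.returncode)
```

Treating "error" the same as "infected" and refusing the upload would be the cautious choice.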
BTW: may I also suggest converting the file extensions to lowercase in the same step where the " "-to-"_" conversion happens? That would be great...
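
What I mean is something along these lines (a Python sketch, the function name is made up; it only touches the extension and leaves the rest of the name alone):

```python
import os


def normalize_upload_name(name: str) -> str:
    """Replace spaces with underscores and lowercase only the extension."""
    base, extension = os.path.splitext(name.replace(" ", "_"))
    return base + extension.lower()
```

For example, normalize_upload_name("My Picture.JPG") gives "My_Picture.jpg", while a name without an extension only gets its spaces replaced.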
Please excuse me if this was all a pile of rubbish based on a misunderstanding - just point it out. Furthermore, I'm willing to write a routine that does the above, or anything else necessary, provided I do not have to dig deep into the MediaWiki code. Just tell me the specs of the function, and I'll post it here.