On Jul 2, 2012, at 4:14 PM, Platonides wrote:
On Jul 1, 2012, at 10:13 PM, Hydriz Wikipedia wrote:
As far as I know, the chances are rather slim, because the MediaWiki software has a malware checker (I think).
Perhaps we should see what the ClamAV check outputs before we can know what is happening.
MediaWiki supports running ClamAV on upload, but WMF isn't running it. I used to run multiple checks on uploads to Wikimedia Commons, until the server where they ran had a disk failure. AFAIK, there's no extra check being done at all.
Even temporarily forgetting about the complexity of scanning PDFs, there's a lot of weirdness in a lot of files that even ClamAV doesn't find. For example: (replacing < and > with [ and ] so this doesn't trigger anyone's mail spam filters)
strings images/wikipedia/commons/7/7c/Silvana_Suárez_7.jpg | tail -9
[!-- INICIO - PUBLICIDAD POP-UP UNDER --]
[IFRAME SRC="http://www.ciudad.com.ar/ar/popunder/p_submit.asp?site=personales.ciudad.com..." width=1 height=1][/IFRAME]
[SCRIPT LANGUAGE="JavaScript"]
//[!--
for (var i=1; i<15; i++){
setTimeout('self.focus();',i*30);
//--]
[/SCRIPT]
[!-- FIN - PUBLICIDAD POP-UP UNDER --]
There are dozens of JPEG files that are valid JPEGs but have encrypted RAR files appended after the end of the JPEG data. It might be worthwhile to take any uploaded jpg/png/gif/etc. and completely rewrite it before using it. Tools like jpegoptim, pngcrush, etc. are pretty good at taking "wild" images and completely rewriting them to remove any oddities.
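
To make the "appended data" case concrete, here is a rough Python sketch of how one could flag such files without rewriting them. It walks the JPEG marker segments up to the end-of-image (EOI) marker and counts whatever follows. The function name jpeg_trailing_bytes is made up for illustration; this is not something MediaWiki or ClamAV does, and a nonzero result only means "worth a look", not "malicious".

```python
import struct


def jpeg_trailing_bytes(data: bytes) -> int:
    """Return how many bytes follow the JPEG end-of-image (EOI) marker.

    Hypothetical helper for illustration only. Raises ValueError when the
    data cannot be walked as a JPEG marker stream.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    n = len(data)
    pos = 2
    while pos < n:
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        # Skip any 0xFF fill bytes that precede the marker code.
        while pos < n and data[pos] == 0xFF:
            pos += 1
        if pos >= n:
            break
        marker = data[pos]
        pos += 1
        if marker == 0x00:
            raise ValueError("stuffed byte outside of scan data")
        if marker == 0xD9:  # EOI: everything after this is trailing data
            return n - pos
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            continue  # TEM and RSTn carry no length field
        if pos + 2 > n:
            break
        (seg_len,) = struct.unpack(">H", data[pos:pos + 2])
        if seg_len < 2:
            raise ValueError("invalid segment length")
        # Skip the segment by its declared length, so marker-like bytes
        # inside metadata (e.g. an embedded thumbnail) are not misread.
        pos += seg_len
        if marker == 0xDA:  # SOS: entropy-coded scan data follows
            while True:
                idx = data.find(b"\xff", pos)
                if idx < 0 or idx + 1 >= n:
                    raise ValueError("truncated scan data")
                nxt = data[idx + 1]
                if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
                    pos = idx + 2  # stuffed byte or restart marker
                elif nxt == 0xFF:
                    pos = idx + 1  # fill byte, look at the next one
                else:
                    pos = idx  # a real marker ends the scan data
                    break
    raise ValueError("no EOI marker found")
```

Something like jpeg_trailing_bytes(open(path, "rb").read()) > 0 would then pick out candidates for a closer look. Rewriting the image with one of the tools above is still the more thorough fix, since it also deals with junk hidden inside segments, which this check does not look at.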
-- Kevin