On 03/07/12 18:47, Kevin Day wrote:
Even temporarily forgetting about the complexity of
scanning PDFs, there's a lot of weirdness in a lot of files that even ClamAV
doesn't find. For example: (replacing < and > with [ and ] so this doesn't
trigger anyone's mail spam filters)
strings images/wikipedia/commons/7/7c/Silvana_Suárez_7.jpg | tail -9
[!-- INICIO - PUBLICIDAD POP-UP UNDER --]
[IFRAME
SRC="http://www.ciudad.com.ar/ar/popunder/p_submit.asp?site=personales.ciudad.com.ar"
width=1 height=1][/IFRAME]
[SCRIPT LANGUAGE="JavaScript"]
//[!--
for (var i=1; i<15; i++){
setTimeout('self.focus();',i*30);
//--]
[/SCRIPT]
[!-- FIN - PUBLICIDAD POP-UP UNDER --]
This looks like the image was stored on a free web hosting server
configured to append that content to every file it served, without
excluding images.
It was then uploaded to Commons.
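
For illustration, here is a rough sketch of how this kind of trailing content could be spotted: walk the JPEG marker segments until the End Of Image (EOI) marker and report whatever follows it. This is not the tool mentioned in this thread; the function names `jpeg_end_offset` and `trailing_data` are made up for the example, and it assumes a single-image baseline or progressive JPEG held entirely in memory. Trailing bytes are not always malicious, so a hit only means the file deserves a closer look.

```python
import struct


def jpeg_end_offset(data: bytes) -> int:
    """Return the offset just past the first top-level EOI marker (FF D9)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    n = len(data)
    pos = 2
    while pos + 1 < n:
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:          # fill byte before a marker
            pos += 1
            continue
        pos += 2
        if marker == 0xD9:          # EOI: the image ends here
            return pos
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            continue                # standalone markers carry no length field
        if pos + 2 > n:
            break
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        if length < 2:
            raise ValueError(f"bad segment length at offset {pos}")
        pos += length               # skip the segment (length includes itself)
        if marker == 0xDA:          # SOS: entropy-coded data follows the header
            while True:
                pos = data.find(b"\xff", pos)
                if pos < 0 or pos + 1 >= n:
                    raise ValueError("truncated: no EOI marker")
                nxt = data[pos + 1]
                if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
                    pos += 2        # stuffed FF 00 or restart marker
                elif nxt == 0xFF:
                    pos += 1        # fill byte
                else:
                    break           # a real marker; let the outer loop handle it
    raise ValueError("truncated: no EOI marker")


def trailing_data(data: bytes) -> bytes:
    """Return the bytes that follow the JPEG image data (empty if none)."""
    return data[jpeg_end_offset(data):]
```

Walking the segments rather than searching for the last FF D9 matters because those two bytes can also occur inside metadata segments (an embedded thumbnail, for example) or inside the appended data itself.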
There are dozens of jpeg files that are valid jpegs that have
encrypted rar files appended to the end of the jpeg data. It
might be a worthwhile idea to take any uploaded jpg/png/gif/etc
and completely rewrite it before using it. Tools like
jpegoptim / pngcrush / etc are pretty good at taking "wild"
images and completely rewriting them to remove any oddities.
-- Kevin
Appended RAR files are one of the things my tool detected.
If you send me a list of the images, I can try to get rid of them.
Modifying the original images would be a bad idea. It'd be better to
forbid uploading such files (though RARs are hard to block, since you
need to scan the full file...).
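
To show what "scan the full file" means in practice, here is a minimal sketch that reads an upload in chunks and looks for the RAR signature anywhere in it. It is only an illustration, not the detection tool discussed above; `find_rar_signature` and `chunk_size` are names invented for the example. It searches for the six bytes `Rar!\x1a\x07`, which both the RAR 4.x and RAR 5 signatures start with. Those bytes could also turn up by chance in compressed image data, so a match is a reason to inspect the file, not proof of an archive.

```python
import io

# First six bytes shared by the RAR 4.x and RAR 5 archive signatures.
RAR_MAGIC = b"Rar!\x1a\x07"


def find_rar_signature(stream, chunk_size=64 * 1024):
    """Return the offset of the first RAR signature in `stream`, or -1.

    The stream is read sequentially in chunks, so the whole file never
    has to be in memory. A few bytes from the previous chunk are kept so
    that a signature split across a chunk boundary is still found.
    """
    overlap = len(RAR_MAGIC) - 1
    tail = b""   # last bytes of the previous window
    base = 0     # file offset of the first byte of `tail`
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return -1
        window = tail + chunk
        hit = window.find(RAR_MAGIC)
        if hit >= 0:
            return base + hit
        keep = window[-overlap:]
        base += len(window) - len(keep)
        tail = keep


# Usage: an in-memory stand-in for an uploaded file.
upload = io.BytesIO(b"\xff\xd8" + b"x" * 10 + b"Rar!\x1a\x07\x00")
offset = find_rar_signature(upload, chunk_size=4)
```

With real uploads the stream would be the file opened in binary mode; the small `chunk_size` above is only there to exercise the chunk-boundary case.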