Embedded malware in media


Kevin Day

12 Mar 2013

1:02 a.m.

We've once again been notified that our mirror of the Wikimedia images is "hosting malware". A quick check suggests it is mostly newly uploaded PDFs with one or more exploits in them, but a few other media types seem to be similarly affected.

I'm personally okay with ignoring it, since it's not hurting us, but ideally I'd like to see things like this get removed. Many of the infected PDFs appear to be Arabic-language documents that would be of interest to people critical of their government, so the implications of what's going on here are probably bigger than random viruses getting added to files.

I'm happy to scan everything again and post a list of what I find. I'm also willing to automate this if it would help (periodic scans, with a list of all questionable images uploaded to a wiki page somewhere?). Does anyone have suggestions on what to do here?

-- Kevin
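A minimal sketch of the "list on a wiki page" idea above, assuming the scan results are already available as (file title, scanner verdict) pairs. The function name `render_flagged_list` and the page layout are hypothetical and not something proposed in the thread; actually posting the text to a wiki is left out.

```python
from datetime import date
from typing import Iterable, Tuple


def render_flagged_list(findings: Iterable[Tuple[str, str]], scanned_on: date) -> str:
    """Render flagged files as a wikitext table (hypothetical layout).

    findings: (file title, scanner verdict) pairs from whatever scanner was used.
    """
    rows = sorted(set(findings))  # de-duplicate and keep a stable order
    lines = [
        f"Scan date: {scanned_on.isoformat()}; {len(rows)} file(s) flagged.",
        "",
        '{| class="wikitable sortable"',
        "! File !! Scanner verdict",
    ]
    for title, verdict in rows:
        lines.append("|-")
        # [[:File:...]] links to the file page instead of embedding the file.
        lines.append(f"| [[:File:{title}]] || <nowiki>{verdict}</nowiki>")
    lines.append("|}")
    return "\n".join(lines)
```

Linking rather than embedding keeps the list page itself from serving the questionable files, and sorting the rows makes successive scans easy to compare.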


Kevin Day

12 Mar

1:08 a.m.

On Mar 11, 2013, at 1:02 PM, Kevin Day kevin@your.org wrote:

...

Added info:

Previous thread discussing this problem:

http://lists.wikimedia.org/pipermail/xmldatadumps-l/2012-July/000565.html

Bug filed about malicious PDFs from last time:

https://bugzilla.wikimedia.org/show_bug.cgi?id=38113

Chris Steipp

2:10 a.m.

On Mon, Mar 11, 2013 at 11:02 AM, Kevin Day kevin@your.org wrote:

...

Kevin, on the current issue: the list you provided last time was helpful, since admins could go through it and delete the files. If you're able to generate that again, I think it would help.

For the longer-term issue, the WMF is not currently scanning uploads with a virus scanner because of the performance cost and false-positive rates. It would be great if we could get a bot to scan and flag files, so we can shorten the time it takes to remove them.

Kevin Day

2:22 a.m.

On Mar 11, 2013, at 2:10 PM, Chris Steipp csteipp@wikimedia.org wrote:

...

Kevin, on the current issue: the list you provided last time was helpful, since admins could go through it and delete the files. If you're able to generate that again, I think it would help.

I'll get started on that now.

...

For the longer-term issue, the WMF is not currently scanning uploads with a virus scanner because of the performance cost and false-positive rates. It would be great if we could get a bot to scan and flag files, so we can shorten the time it takes to remove them.

If this is something that you guys don't easily have the resources to do internally, I could probably come up with something that runs on our end. There is a bit of delay between something being uploaded and it reaching us (I'm talking with Ariel Glenn right now to determine what the latency is), but if you're happy with a rather slow turnaround, it wouldn't be hard for me to script what I'm doing. Periodically rescanning everything is probably better than just a scan on import: I'd be very surprised if the infections I'm seeing were known to virus scanners at the time the files were uploaded.

-- Kevin
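The case for periodic rescanning over scan-on-import can be sketched as a selection rule: a file is due for another scan if it is new, has changed, or was last scanned with older signatures than the scanner has now. This is only an illustration under assumptions; `ScanRecord`, `files_to_rescan`, and the integer signature-database version are hypothetical names, not part of any tooling mentioned here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScanRecord:
    sha1: str        # content hash at the time of the last scan
    db_version: int  # signature database version used for that scan (assumed integer)


def files_to_rescan(
    current: Dict[str, str],           # path -> current content hash on the mirror
    history: Dict[str, ScanRecord],    # path -> record of the last scan, if any
    db_version: int,                   # signature database version available now
) -> List[str]:
    """Return paths that were never scanned, changed, or scanned with older signatures."""
    due = []
    for path, sha1 in current.items():
        record = history.get(path)
        if record is None or record.sha1 != sha1 or record.db_version < db_version:
            due.append(path)
    return sorted(due)
```

A scan-on-import policy corresponds to checking only `record is None`; the `db_version` comparison is what catches files whose exploits became detectable after upload.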

Platonides

5:48 a.m.

On 11/03/13 20:22, Kevin Day wrote:

...

I don't think it would really be a problem. I can try to run something on labs. Ben installed a Swift instance in labs 12 months ago, but it is probably better to download directly, as Swift is too greedy for this.

Richard Farmbrough

22 Mar

3:08 a.m.

I don't see how having the scan done elsewhere affects false positives. As for performance, phooey. That's a red herring: simply have stand-alone scanners that copy what they want to scan and report back.
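A rough sketch of such a stand-alone scanner, assuming ClamAV's `clamscan` command-line tool is installed; the thread does not name a scanner, so that choice and the helper names `parse_verdict` and `scan_copy` are assumptions. It copies one file to a scratch directory, scans the copy, and returns the verdict for the caller to report.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def parse_verdict(output: str) -> Optional[str]:
    """Return the signature name from a 'path: NAME FOUND' line, or None if clean."""
    for line in output.splitlines():
        line = line.rstrip()
        if line.endswith(" FOUND"):
            # Split on the last ': ' so file names containing ': ' still work.
            _, sep, signature = line[: -len(" FOUND")].rpartition(": ")
            if sep:
                return signature
    return None


def scan_copy(source: Path) -> Optional[str]:
    """Scan a private copy of source so the scan never touches the serving path."""
    with tempfile.TemporaryDirectory() as scratch:
        copy = Path(scratch) / source.name
        shutil.copy2(source, copy)
        proc = subprocess.run(
            ["clamscan", "--no-summary", str(copy)],
            capture_output=True,
            text=True,
        )
        # clamscan exits 0 when clean and 1 when something is found;
        # any other code means the scan itself failed.
        if proc.returncode not in (0, 1):
            raise RuntimeError(f"clamscan failed: {proc.stderr.strip()}")
        return parse_verdict(proc.stdout)
```

Because the scanner works on its own copy and only reports a verdict, it can run on a separate machine and on its own schedule, which is the point being made about performance.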


xmldatadumps-l@lists.wikimedia.org
