Hi all, it is perhaps unfortunate that my first post to the list is a bit of a rant, but I'm very concerned about the abuse of the file upload feature and so far I haven't seen any responses to the discussion on meta from people with the power to do something about it.
It seems to me from following the upload queue that it is being deliberately abused to store distinctly non-encyclopedic and copyright-infringing material on a disconcertingly large scale.
I would assume that this would be of direct financial concern to Bomis, both for the potential of large bandwidth costs and the potential for being sued, and to the rest of the project because of the risk of being shut down by a lawsuit.
Are we just going to attempt to keep this under control by the vigilance of a few dedicated Wikipedians, or might, in this case, some technical measures to discourage abuse be appropriate?
For instance, it might be feasible to restrict uploads to registered users, to limit the size and number of files that a user can upload in a day, and perhaps even to restrict the types of files that can be uploaded by checking them with the "file" file type checking utility (that would at least prevent the uploading of executables).
I would be prepared to help implement some of the above. I'm a decent programmer, though I don't have any experience with PHP so it'd take me some time to get up to speed.
Obviously, I think technical measures to slow things down are justified in this case; otherwise I suspect too much time will be wasted weeding out rather noxious material.
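The checks proposed above could be sketched roughly as follows. This is only an illustration in Python, not the Wikipedia software (which is PHP): the limits, the `UploadRequest` type and the `check_upload` function are all hypothetical, and a simple leading-bytes test stands in for the "file" utility.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limits; the thread does not propose specific numbers.
MAX_BYTES = 2 * 1024 * 1024
MAX_UPLOADS_PER_DAY = 10

# Leading bytes of two common executable formats (ELF and DOS/Windows "MZ").
# This is a stand-in for the "file" utility, not a replacement for it.
EXECUTABLE_MAGIC = (b"\x7fELF", b"MZ")


@dataclass
class UploadRequest:
    user: Optional[str]   # None means the uploader is not logged in
    data: bytes           # raw contents of the uploaded file
    uploads_today: int    # uploads this user has already made today


def check_upload(req: UploadRequest) -> Optional[str]:
    """Return a rejection reason, or None if the upload is acceptable."""
    if req.user is None:
        return "login required"
    if len(req.data) > MAX_BYTES:
        return "file too large"
    if req.uploads_today >= MAX_UPLOADS_PER_DAY:
        return "daily upload limit reached"
    if req.data.startswith(EXECUTABLE_MAGIC):
        return "executables are not allowed"
    return None
```

Returning a reason string rather than a bare yes/no would let the upload form tell the user why a file was refused.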
On Mon, 25 Feb 2002, Robert Graham Merkel wrote:
It seems to me from following the upload queue that it is being deliberately abused to store distinctly non-encyclopedic and copyright-infringing material on a disconcertingly large scale.
I agree 100% that this is a problem. Last night I deleted several dozen files that someone had overwritten as being (obviously) inappropriate for Wikipedia articles. I was a little concerned from the beginning that having virtually no restrictions on the upload function would have this effect, so it's not too surprising that this is happening.
I would assume that this would be of direct financial concern to Bomis, both for the potential of large bandwidth costs and the potential for being sued, and to the rest of the project because of the risk of being shut down by a lawsuit.
It's not a huge financial concern, but I'd agree the risk is there...
Are we just going to attempt to keep this under control by the vigilance of a few dedicated Wikipedians, or might, in this case, some technical measures to discourage abuse be appropriate?
For instance, it might be feasible to restrict uploads to registered users, to limit the size and number of files that a user can upload in a day, and perhaps even to restrict the types of files that can be uploaded by checking them with the "file" file type checking utility (that would at least prevent the uploading of executables).
I like all of these ideas. Certainly in any case only registered users should be able to upload files. This seems a reasonable thing to ask, given the potential for abuse that the file uploader represents. Right now, I can't even spot a miscreant's IP address by looking at the log.
I would be prepared to help implement some of the above. I'm a decent programmer, though I don't have any experience with PHP so it'd take me some time to get up to speed.
Obviously, I think technical measures to slow things down are justified in this case; otherwise I suspect too much time will be wasted weeding out rather noxious material.
I don't think that the technical measures you propose will slow things down very much at all. The only person who might upload beyond a given size limit would be Magnus. :-) I imagine that there is hardly anyone who (1) refuses to sign in but who (2) wants to upload a useful file (e.g., a public domain photo for a biography).
I don't know if the specific proposals you make are the best, but I agree that something along these lines should be done.
Larry
I already changed the software so it only accepts uploads from users who are logged in. I did this two days or so ago. It might help if Jimbo would actually install the latest version ;)
Magnus
Larry Sanger wrote:
I agree 100% that this is a problem. Last night I deleted several dozen files that someone had overwritten as being (obviously) inappropriate for Wikipedia articles. I was a little concerned from the beginning that having virtually no restrictions on the upload function would have this effect, so it's not too surprising that this is happening.
We just never gave the programmers a good set of requirements for the uploader.
On Mon, 25 Feb 2002, Lars Aronsson wrote:
Larry Sanger wrote:
I agree 100% that this is a problem. Last night I deleted several dozen files that someone had overwritten as being (obviously) inappropriate for Wikipedia articles. I was a little concerned from the beginning that having virtually no restrictions on the upload function would have this effect, so it's not too surprising that this is happening.
From a general, Wiki-philosophical-social aspect, it is interesting
that the upload function gets abused, while general Wiki pages do not.
Actually, there's a good reason for it: the images aren't obviously linked to anything in any article. This is an ABSOLUTELY essential piece of information to have: what articles *use* the image in question? If no article uses an image after 24 hours, perhaps we should delete the image (or put it in a queue to be deleted by a human). So, the point is, without a context, unless some image is at face value obviously worthless to any Wikipedia article (e.g., porn advertisements), it's difficult for us to tell whether an image really is appropriate for the 'pedia. It would even make it easier for us to determine whether an image is copyrighted.
One way around this would be to attach images to unique articles, so that the uploading of an image would be logged in a particular article's history. I don't know if I like this suggestion, though, I'm just throwing it out there for your consideration.
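The 24-hour rule could look something like the sketch below. It is Python for illustration only; the two dictionaries and the function name `deletion_candidates` are made up, and how the software would actually know which articles use a file is left open.

```python
from datetime import datetime, timedelta

# Grace period from the suggestion above.
GRACE_PERIOD = timedelta(hours=24)


def deletion_candidates(uploads, used_by, now):
    """List files that no article uses and that are older than the grace period.

    uploads: dict mapping filename -> upload time (datetime)
    used_by: dict mapping filename -> set of article titles using the file
    now:     current time (datetime)
    """
    return sorted(
        name
        for name, uploaded_at in uploads.items()
        if not used_by.get(name) and now - uploaded_at >= GRACE_PERIOD
    )
```

The result is a list for a human to review, in line with the "queue to be deleted by a human" option, rather than something that deletes files itself.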
Here's another thing we need in that upload form. We should ask people to choose: (1) I have created this image and release it under the GNU FDL (or contribute it to Wikipedia); (2) I personally certify that this image is public domain (if checked, add a text box requiring that a source be given--a URL or else a book title, say); (3) other? If none are checked, then the uploader wouldn't accept the file.
Under some schemes we might want (1) to require that the uploader identify which article the image is going to be used in, and (2) to check that the image title is linked to from that article. But (1) might be done automatically, I guess...
Doing these things would remove a fair bit of the abuse. It would certainly make it a lot easier for the community to act as a check on the abuse.
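As a rough sketch of the form check described above (Python for illustration; the choice names and `validate_declaration` are hypothetical, and the open "(3) other?" option is deliberately not modelled):

```python
from typing import Optional

# Hypothetical identifiers for the first two statements on the form.
LICENSE_CHOICES = {"own_work_fdl", "public_domain"}


def validate_declaration(choice: Optional[str], source: str = "") -> Optional[str]:
    """Return a rejection reason, or None if the declaration is complete."""
    if choice not in LICENSE_CHOICES:
        return "choose a licensing statement"
    # A public domain claim must come with a source (a URL or a book title, say).
    if choice == "public_domain" and not source.strip():
        return "give a source for the public domain claim"
    return None
```

For example, `validate_declaration("public_domain")` is refused until a source is filled in, while `validate_declaration("own_work_fdl")` passes as is.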
Perhaps the uploads should be visible in the RecentChanges list?
They already are, sort of--but each one individually should be, which isn't the case now.
Perhaps there should be a "view other versions" for each upload?
Maybe--would prevent people from uploading porn in place of legit images, for instance.
Perhaps a Wikipage in the upload: namespace for each uploaded object?
Maybe...?
Larry
On Mon, 2002-02-25 at 11:16, Larry Sanger wrote:
On Mon, 25 Feb 2002, Lars Aronsson wrote:
From a general, Wiki-philosophical-social aspect, it is interesting
that the upload function gets abused, while general Wiki pages do not.
Actually, there's a good reason for it: the images aren't obviously linked to anything in any article. This is an ABSOLUTELY essential piece of information to have: what articles *use* the image in question? If no article uses an image after 24 hours, perhaps we should delete the image (or put it in a queue to be deleted by a human).
At the moment, though, the non-English wikipedias don't have their own upload capabilities. Images that are used on the other wikis thus tend to end up uploaded to www.wikipedia.com or meta.wikipedia.com without necessarily being used where they were uploaded (especially diagrams and maps with language-specific names, descriptions, etc).
So please, don't delete my Esperantized maps. :)
So, the point is, without a context, unless some image is at face value obviously worthless to any Wikipedia article (e.g., porn advertisements), it's difficult for us to tell whether an image really is appropriate for the 'pedia. It would even make it easier for us to determine whether an image is copyrighted.
One way around this would be to attach images to unique articles, so that the uploading of an image would be logged in a particular article's history. I don't know if I like this suggestion, though, I'm just throwing it out there for your consideration.
I'm not quite sure how to go about doing that.
What could be done though that may be useful, is to add a link to a "pages that link to this file" function next to each name in the uploaded file list. A start, at least, though it doesn't cover the multi-wiki problem.
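A very rough sketch of such a function, in Python for illustration only. It assumes that an article refers to a file simply by having the filename somewhere in its text, which may not match how the software really links files; `pages_linking_to` and the `articles` dictionary are made-up names.

```python
def pages_linking_to(filename, articles):
    """Return the titles of articles whose text mentions the given filename.

    articles: dict mapping article title -> article text (a single wiki only)
    """
    return sorted(title for title, text in articles.items() if filename in text)
```

Since it only looks at the articles it is handed, it shares the single-wiki limitation noted above.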
-- brion vibber (brion @ pobox.com)
wikipedia-l@lists.wikimedia.org