On 17.10.2011 12:47, Andreas Kolbe wrote:
Note: This foundation-l post is cross-posted to commons-l, since this
discussion may be of interest there as well.
> From: Tobias Oelgarte <tobias.oelgarte(a)googlemail.com>
> It is an in-house made problem, as I explained at brainstorming.
To put it short: it is a self-made problem, based on the fact that these
images got more attention than others. Thanks to failed deletion
requests, they had many people caring about them. This results in more
exact descriptions and file naming than in average images. That's what
search engines prefer, and now we have them at a top spot. Thanks for
caring so much about these images and not treating them like anything
else.
I don't think that is the case, actually. Brandon described how the
search function works here:
To take an example, the file
(a prominent search result in searches for "shower") has never had its
name or description changed since it was uploaded from Flickr. My
impression is that refinement of file names and descriptions following
discussions has little to do with sexual or pornography-related media
appearing prominently in search listings. The material is simply
there, and the search function finds it, as it is designed to do.
That is again cherry-picking an example. But what do you expect to find?
Say that someone actually searches for an image of this practice. Should
he find it at the last spot? A good search algorithm treats everything
equally and delivers the closest matches. A more intelligent search
would deliver images of showers first if you search for "shower", since
it knows the difference between the terms "golden shower" and "shower".
That's how it should work. It's definitely not an error of the search
engine itself, but it could be improved to deliver better-matching
results, without any marking. Extending it to exclude marked content
leads back to the basic questions, which should be answered first.
Picking examples like this represents exactly the kind of argumentation
that leads everywhere but not to a solution. I already described in the
post "Controversial Content vs Only-Image-Filter" that single examples
don't represent the overall picture. A single example also isn't a
contribution to the discussion as an argument; it would only be an
argument if we knew the effects that occur. We have to settle the
question:
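The "golden shower" vs. "shower" distinction mentioned above could, in principle, be approximated by down-ranking results where the query term only occurs inside a longer known phrase. A toy sketch of that idea follows; the phrase list and the scoring function are invented for illustration and this is not how MediaWiki's search actually works:

```python
# Toy sketch of phrase-aware ranking: prefer results where the query word
# stands on its own over results where it only appears inside a longer,
# more specific phrase. The phrase list and scoring are invented for
# illustration; this is not how MediaWiki's search actually works.

KNOWN_PHRASES = {"golden shower", "baby shower"}  # hypothetical phrase list

def phrase_aware_score(query, text):
    """Return 0 for a plain match, 1 for a phrase-only match, None for no match."""
    query, text = query.lower(), text.lower()
    if query not in text:
        return None
    for phrase in KNOWN_PHRASES:
        # The query term only matched as part of a longer phrase.
        if query != phrase and query in phrase and phrase in text:
            return 1
    return 0
```

Sorting results by this score would put plain showers first for the bare query "shower", without excluding or marking anything.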
It is hard to say how
else to provide evidence of a problem, other
than by giving multiple (not single) examples of it.
You could also search for blond, blonde, red hair, strawberry, or
What is striking is the crass sexism of some of the filenames and
image descriptions: "blonde bombshell", "Blonde teenie sucking",
so sexy", "These two had a blast showing off" etc.
One of the images shows a young woman in the bathroom, urinating:
Her face is fully shown, and the image, displayed in the Czech
Wikipedia, carries no personality rights warning, nor is there
evidence that she has consented to or is even aware of the upload.
And I am surprised how often images of porn actresses are found in
search results, even for searches like "Barbie". Commons has 917 files
in Category:Unidentified porn actresses alone. There is no
corresponding Category:Unidentified porn actors (although there is of
course a wealth of categories and media for gay porn actors).
Evidence would be a statistic showing how many people are actually happy
with the results, with "happy" meaning: "I will use it again and was not
so offended that I won't use it."
If the naming of those images is a problem, then we can just rename them
to something more useful. We have templates and bots for that. Marking
the images would not help in this case. But doing what we can already
do would be a simple and working solution: rename them.
The case of this image and others is already addressed in COM:PEOPLE. I
also see no direct relation between this topic (keeping/deleting) and
the search function and its results.
Everyone should know that "Barbie" is an often-used term or part of a
pseudonym. That the search reacts to both is quite right. The word
itself does not distinguish between multiple meanings. But that's again
not the problem.
I must remind you not to construct special cases. Better to spend the
time searching for good solutions, which don't need to discriminate
against content to give the best results possible.
* Is it a problem that the search function displays sexual content? (A
search should find anything related, by definition.)
I think the search function works as designed, looking for matches in
file names and descriptions.
That means it does its job as intended.
* Is sexual content overrepresented by the search function?
I don't think so. The search function simply shows what is there.
However, the sexual content that comes up for innocuous searches
sometimes violates the principle of least astonishment, and thus may
turn some users off using, contributing to, or recommending Commons as
an educational resource.
That needs a big quotation mark, and it has been an unproven statement
since the beginning of the discussion. Commons and Wikipedia are meant
to represent the whole variety of knowledge. A search for a word will
eventually deliver anything that is called that way, ambiguous or not.
That means you will find anything related, since the projects don't aim
at a special audience. Take "kids", for example.
* If that is the case, why is it that way?
* Can we do something about it without drastic changes, like
One thing that might help would be for the search function to
privilege files that are shown in top-level categories containing the
search term: e.g. for "cucumber", first display all files that are in
category "cucumber", rather than those contained in subcategories,
like "sexual penetrative use of cucumbers", regardless of the file
name (which may not have the English word "cucumber" in it).
Such a search should definitely be an option. After reading Brandon's
comment I must also wonder why it doesn't consider categories. Those are
the places where content is already pre-sorted by ourselves. It would
definitely be worth the effort, since it would do two things at once:
1. It would most likely give better results, even if the description or
filename is not translated.
2. A search function which finds content more effectively would also
minimize the effect we are talking about.
A second step would be to make sure that sexual content is not housed
in the top categories, but in appropriately named subcategories. This
is generally already established practice. Doing both would reduce the
problem somewhat, at least in cases where there is a category that
matches the search term.
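The category-privileging rule described above could be sketched roughly like this; the data model (files as dicts with plain-string category names) is purely hypothetical and only illustrates the ranking idea, not how Commons' real search backend works:

```python
# Hypothetical sketch of "privilege files in the matching top-level category".
# The data model is invented for illustration; Commons' actual search
# backend indexes and ranks pages quite differently.

def rank_results(query, files):
    """Return matching files; files sitting directly in a category named
    like the query come before files that only match via a subcategory
    or via their file name/description."""
    q = query.lower()

    def matches(f):
        return (q in f["name"].lower()
                or q in f["description"].lower()
                or any(q in c.lower() for c in f["categories"]))

    def score(f):
        cats = [c.lower() for c in f["categories"]]
        if q in cats:
            return 0      # directly in a category named like the query
        if any(q in c for c in cats):
            return 1      # only in a subcategory mentioning the term
        return 2          # matched only via file name or description

    return sorted((f for f in files if matches(f)), key=score)

files = [
    {"name": "Weird_use.jpg", "description": "",
     "categories": ["Sexual penetrative use of cucumbers"]},
    {"name": "Cucumis_sativus.jpg", "description": "A cucumber on a table",
     "categories": ["Cucumber"]},
]
```

With this rule, a search for "cucumber" would surface the file categorized directly under "Cucumber" before anything from a subcategory, regardless of file names.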
I'm a little against categories that are purely introduced to divide
content into sexual (offensive) and non-sexual (not offensive) content.
If the practice/depiction has its own specialized term, then it is
acceptable. But introducing pseudo-categories just blows up the category
tree and effectively hides content. If we implement the first idea and
introduce special categories, then we are effectively back at filtering
and non
PS: I was wondering which mail client you use. Usually the structure is
destroyed and the order of mails (re:) is not kept, which makes it hard
to follow conversations.
On 17.10.2011 02:56, Andreas Kolbe wrote:
Personality conflicts aside, we're noting that non-sexual search
terms in Commons can prominently return sexual images of varying
explicitness, from mild nudity to hardcore, and that this is
different from entering a sexual search term and finding that
Google fails to filter some results.
I posted some more Commons search terms where this happens on
Black, Caucasian, Asian;
Male, Female, Teenage, Woman, Man;
Drawing, Drawing style;
Drinking, Custard, Tan;
Hand, Forefinger, Backhand, Hair;
Bell tolling, Shower, Furniture, Crate, Scaffold;
Galipette – French for "somersault"; this leads to a collection of
1920s pornographic films which are undoubtedly of significant
historical interest, but are also pretty much as explicit as any
modern representative of the genre.
Commons-l mailing list