Unfortunately we currently have zero developers working on search (as
far as I know). There are several more significant search bugs that are
also not going to be fixed any time soon. Another issue is that our
search engine is written in Java while the rest of MediaWiki is PHP. This makes
sense for performance reasons, but makes the pool of potential
developers who are able and willing to work on it much smaller. In other
words, this might get fixed in a few years, but I wouldn't hold my
breath. In the meantime, it would be good to follow Sarah's lead and
proactively curate the content we have so that there is less potential
for astonishment in our search results.
Ryan Kaldari
On 10/13/11 5:37 PM, Andreas Kolbe wrote:
John,
*From:* John Vandenberg <jayvdb(a)gmail.com>
(Searching for "levee" in Commons brings up an image of a naked
Suicide Girl called Levee in third place.)
It's a thumbnail for !@#$ sake, and anyone who finds that image
offensive should turn off their internet connection.
It's a perfectly nice image, but does it answer the user's need? In
most cases probably not. If I google levee, I see levees, not nude girls:
http://www.google.co.uk/search?gcx=c&q=levee&um=1&ie=UTF-8&…
<http://www.google.co.uk/search?gcx=c&q=levee&um=1&ie=UTF-8&hl=en&tbm=isch&source=og&sa=N&tab=wi&biw=1041&bih=638>
If I want to google for pictures of Levee, I google for "Levee Suicide
Girls", and there she is:
http://www.google.co.uk/search?gcx=c&q=levee&um=1&ie=UTF-8&…
<http://www.google.co.uk/search?gcx=c&q=levee&um=1&ie=UTF-8&hl=en&tbm=isch&source=og&sa=N&tab=wi&biw=1041&bih=638#um=1&hl=en&tbm=isch&sa=1&q=levee+suicide+girl&pbx=1&oq=levee+suicide+girl&aq=f&aqi=&aql=&gs_sm=e&gs_upl=127182l129981l0l130379l15l15l0l11l0l0l291l930l0.1.3l4l0&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=120e52a58330422e&biw=1041&bih=638>
I guess Commons should give more weight to categories, and less weight
to file names. So when I search Commons for cucumber, it should show me
images in the cucumber category first of all, and not images that
happen to have cucumber in the title.
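To make the weighting idea concrete, here is a minimal sketch in Python. It is not how the Commons search engine actually works; the MediaFile type, the two weight constants, and the function names are all hypothetical and chosen only for illustration. It scores a category match higher than a file-name match, so files in the matching category sort first.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical weights (assumption): a category match counts for more
# than a file-name match. The real values would need tuning.
CATEGORY_WEIGHT = 3.0
FILENAME_WEIGHT = 1.0


@dataclass
class MediaFile:
    """Hypothetical stand-in for a Commons file and its metadata."""
    filename: str
    categories: List[str] = field(default_factory=list)


def score(item: MediaFile, query: str) -> float:
    """Sum the weights of every field in which the query term occurs."""
    term = query.lower()
    total = 0.0
    if any(term in category.lower() for category in item.categories):
        total += CATEGORY_WEIGHT
    if term in item.filename.lower():
        total += FILENAME_WEIGHT
    return total


def search(items: List[MediaFile], query: str) -> List[MediaFile]:
    """Return matching items, highest score first; non-matches are dropped."""
    scored = [(score(item, query), item) for item in items]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in hits]


files = [
    MediaFile("Cucumber joke poster.jpg", ["Posters"]),
    MediaFile("IMG_0042.jpg", ["Cucumbers"]),
    MediaFile("Tomato.jpg", ["Tomatoes"]),
]
results = search(files, "cucumber")
```

With these sample files, the image that sits in the cucumber category is listed ahead of the one that merely has "cucumber" in its name, and the unrelated file is left out. Because the category is set by curators rather than by the uploader's choice of file name, this kind of ordering would also be harder to game.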
Brandon, is there something developers could do in this regard?
I am sure you'll be appalled that libraries include nude pictures in
their search results, often when searching for something else.
http://trove.nla.gov.au/picture/result?q=contemporary+north+america+20th+ce…
fix the metadata.
create a gallery page.
create a category and populate it.
etc
p.s. abstract art offends me. Can we please remove media related to
John Levee from the Commons search results for the term 'Levee'? ;-)
We should be under no illusion that we can find all search terms whose
results violate the principle of least surprise, presenting adult
images for everyday search terms.
New such situations arise on a daily basis, each time someone uploads
an explicit file that has a plausible search term in its name and
description (try searching Commons for "eating", and then search for
"drinking"; or try finding images of Prince Albert).
The ordering of the search results isn't ideal. Have you raised a bug?
The thing is, John, it's not a bug. How is it a bug? The image is
called "Drinking urine" or whatever, and so it's a valid search result
for "drinking". No doubt, a bunch of people would argue that it would
be non-neutral to exclude it from the search results for drinking,
because Wikipedia is not censored, and we don't care if people are
unhappy with our service, because that would be non-neutral. ;)
<Imagine rant here.>
It puts too much weight on the filename, which isn't good because we
recommend against renaming, so the current search results are gameable
by the uploader.
We should simply offer safe search, like Google does.
Google provides safe search. They need to convert 'the internet' into
a search results page that their customer wants to see, and the
Internet has a whole lot of stuff that 99% of the world never wants to
see.
Wikipedia provides encyclopedic information.
Commons provides a depository of media, and if you search for keywords
in the metadata you'll see thumbnails of the matching media.
I find Google safe search seriously useful, because it gives me a
choice, and enables me to tailor my search to my requirements. If I
want to see porn, I can see porn. If I'm looking for something else, I
can prevent my search being flooded with porn.
If I am a researcher looking for images of Prince Albert on Commons, I
would appreciate not being forced to wade through dozens of images of
penises with rings in them to find the image I'm looking for.
http://commons.wikimedia.org/w/index.php?title=Special:Search&redirs=1&…
<http://commons.wikimedia.org/w/index.php?title=Special:Search&redirs=1&ns0=1&ns6=1&ns9=1&ns12=1&ns14=1&ns100=1&ns106=1&search=Prince+albert&limit=500&offset=0>
We will not attract a more mature audience until we get our act together.
Andreas
_______________________________________________
Gendergap mailing list
Gendergap(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/gendergap