On 2/27/07, David Gerard <dgerard(a)gmail.com> wrote:
On 13/02/07, Tangotango <tangotango(a)wikignome.net> wrote:
I've uploaded my Wikimedia Commons image/media search tool, dubbed
"Mayflower", to the toolserver; it's available at
http://tools.wikimedia.de/~tangotango/mayflower/.
Any comments and/or suggestions would be most welcome.
Interesting!
I just did a search on "Canon camera". Um. Well, there are a few
pictures of Canon cameras and lenses there. Mostly shots taken *with*
a Canon camera. And one pic of [[:en:David Cameron]] ...
The search would be well equipped to handle this if only we didn't use
categories in a rather brain-dead manner.
In your "Canon camera" search most of the results just have the words
"canon" and "camera" in random spots in the description text.
The search you want is:
http://tools.wikimedia.de/~tangotango/mayflower/search.php?q=Canon&ic=C…
But because of the obsession with sub-categorization on Commons, the
images end up scattered across an endless number of sub-categories, so
you miss most of the results. :(
If we used categories well, you could depend on there being a category
"cameras" and a category "canon products" on the pictures you want.
Mayflower will gladly search for the intersection of these two
taggings.
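
To make the intersection idea concrete, here is a minimal sketch in
Python. It is not Mayflower's actual code; the file names, the category
data, and the function name search_intersection are all made up for
illustration. It assumes every file carries its categories directly as
flat tags.

```python
# Hypothetical data: file name -> set of categories tagged directly on it.
# None of these file names or tags come from Commons; they are examples.
file_categories = {
    "Canon_EOS_400D.jpg": {"Cameras", "Canon products"},
    "Canon_printer.jpg": {"Printers", "Canon products"},
    "Nikon_D80.jpg": {"Cameras", "Nikon products"},
    "Sunset_over_lake.jpg": {"Sunsets", "Photos taken with Canon"},
}


def search_intersection(files, wanted):
    """Return the sorted names of files tagged with every category in `wanted`."""
    wanted = set(wanted)
    # `wanted <= cats` is a subset test: the file must carry all wanted tags.
    return sorted(name for name, cats in files.items() if wanted <= cats)


results = search_intersection(file_categories, ["Cameras", "Canon products"])
print(results)
```

With flat tags like these, the photo that was merely taken with a Canon
never matches, because it is tagged neither "Cameras" nor "Canon
products".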
I'm sure someone will suggest having the search walk the category
hierarchy, but not only is this computationally expensive, it will
also give incorrect results: [[Category:Photos_taken_with_Canon]] is
a child of [[Canon_cameras]], and I wouldn't be surprised to find that
some of the "Taken with Canon foo5" are children of "Canon foo5".
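
A small sketch of why the walk goes wrong, again in Python. The tree
below is a toy: the Photos_taken_with_Canon link is the one described
above, the "Taken with Canon foo5" link is the guessed one, and every
file name is hypothetical. The function walk_category is likewise an
illustration, not anything Mayflower implements.

```python
from collections import deque

# Toy category tree: parent category -> child categories.
subcategories = {
    "Canon cameras": ["Canon foo5", "Photos taken with Canon"],
    "Canon foo5": ["Taken with Canon foo5"],
}

# Hypothetical files placed directly in each category.
direct_members = {
    "Canon cameras": ["Canon_camera_lineup.jpg"],
    "Canon foo5": ["Canon_foo5_front.jpg"],
    "Photos taken with Canon": ["Sunset_over_lake.jpg"],
    "Taken with Canon foo5": ["Portrait_of_a_politician.jpg"],
}


def walk_category(root):
    """Breadth-first walk collecting files from `root` and all descendants."""
    seen = {root}          # guards against cycles in the category graph
    queue = deque([root])
    files = []
    while queue:
        category = queue.popleft()
        files.extend(direct_members.get(category, []))
        for child in subcategories.get(category, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return files


print(walk_category("Canon cameras"))
```

The walk returns the sunset and the portrait alongside the two actual
camera pictures, since a parent-child link does not say whether the
child is "a kind of" or merely "related to" the parent.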
I don't think that Photos_taken_with_Canon as a child of Canon_cameras
is actually an unreasonable situation when you think of the category
tree as a structure designed to help humans find categories. ... Nor
do I think that using the category tree exclusively as a human aid is
a problem: because there are too many semantic link types, drift will
always make such simple relationships error-prone.
If we ever want it to be as easy to find images on Commons as it is on
commercial stock photography sites like Getty, we need to stop
pretending that all categories are best used as tools for manual
browsing.