On 2/27/07, David Gerard dgerard@gmail.com wrote:
On 13/02/07, Tangotango tangotango@wikignome.net wrote:
I've uploaded my Wikimedia Commons image/media search tool, dubbed "Mayflower", to the toolserver; it's available at http://tools.wikimedia.de/~tangotango/mayflower/. Any comments and/or suggestions would be most welcome.
Interesting!
I just did a search on "Canon camera". Um. Well, there are a few pictures of Canon cameras and lenses there. Mostly shots taken *with* a Canon camera. And one pic of [[:en:David Cameron]] ...
The search would be well equipped to handle this if only we didn't use categories in a rather brain dead manner.
In your "Canon camera" search most of the results just have the words "canon" and "camera" in random spots in the description text.
The search you want is:
http://tools.wikimedia.de/~tangotango/mayflower/search.php?q=Canon&ic=Ca...
But because of the obsession with sub-categorization on Commons, the endless number of sub-categories means you miss most of the results. :(
If we used categories well, you could depend on there being a category "cameras" and a category "canon products" on the pictures you want. Mayflower will gladly search for the intersection of these two taggings.
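To make the intersection idea concrete, here is a minimal Python sketch, assuming each image carries a flat set of category tags. The image names, the tags, and the function name search_intersection are all made up for illustration; this is not Mayflower's actual code, just the shape of the query.

```python
# Hypothetical data: each image maps to the flat set of categories it is tagged with.
IMAGE_CATEGORIES = {
    "Canon_EOS_400D.jpg": {"Cameras", "Canon products"},
    "Canon_EF_50mm.jpg": {"Lenses", "Canon products"},
    "Nikon_D80.jpg": {"Cameras", "Nikon products"},
    "Sunset_over_lake.jpg": {"Sunsets", "Photos taken with Canon"},
}


def search_intersection(image_categories, wanted):
    """Return the sorted names of images tagged with every category in `wanted`."""
    wanted = set(wanted)
    # `wanted <= cats` is a subset test: the image must carry all requested tags.
    return sorted(
        name for name, cats in image_categories.items() if wanted <= cats
    )


if __name__ == "__main__":
    print(search_intersection(IMAGE_CATEGORIES, ["Cameras", "Canon products"]))
```

With flat tags like these, only the picture that is both a camera and a Canon product comes back; the photo merely taken with a Canon never enters the result.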
I'm sure someone will suggest having the search walk the category hierarchy, but not only is this computationally expensive, it will also produce incorrect results: [[Category:Photos_taken_with_Canon]] is a child of [[Canon_cameras]], and I wouldn't be surprised to find that some of the "Taken with Canon foo5" are children of "Canon foo5".
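A small Python sketch of why the hierarchy walk goes wrong, assuming a toy category tree in which Photos_taken_with_Canon sits under Canon_cameras as described above. The other category name, the image names, and the function images_under are hypothetical; the real Commons tree is far larger, which is where the cost comes from.

```python
from collections import deque

# Hypothetical category tree: parent -> direct child categories.
SUBCATEGORIES = {
    "Canon_cameras": ["Canon_EOS_cameras", "Photos_taken_with_Canon"],
    "Canon_EOS_cameras": [],
    "Photos_taken_with_Canon": [],
}

# Hypothetical direct membership: category -> images filed directly in it.
MEMBERS = {
    "Canon_cameras": ["Canon_AE-1.jpg"],
    "Canon_EOS_cameras": ["Canon_EOS_400D.jpg"],
    "Photos_taken_with_Canon": ["Sunset_over_lake.jpg", "Portrait.jpg"],
}


def images_under(root):
    """Collect images in `root` and all of its descendants, breadth-first."""
    seen = {root}
    queue = deque([root])
    found = []
    while queue:
        cat = queue.popleft()
        found.extend(MEMBERS.get(cat, []))
        for child in SUBCATEGORIES.get(cat, []):
            if child not in seen:  # guard against cycles in the category graph
                seen.add(child)
                queue.append(child)
    return found


if __name__ == "__main__":
    for name in images_under("Canon_cameras"):
        print(name)
```

The walk returns the sunset and the portrait alongside the two pictures of actual cameras, because nothing in the tree says that a child category means "is a kind of" rather than "is somehow related to".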
I don't think that Photos_taken_with_Canon as a child of Canon_cameras is actually an unreasonable situation when you think of the category tree as a structure designed to help humans find categories. ... Nor do I think that using the category tree exclusively as a human aid is a problem, since, due to there being too many semantic link types, the drift will always make such simple relationships error-prone.
If we ever want it to be as easy to find images on commons as it is on commercial stock photography sites like Getty, we need to stop pretending that all categories are best used as tools for manual browsing.