Hi, this is a follow-up to my posts regarding "The Ideal Wiki Software" on foundation-l, from late January.
I've uploaded my Wikimedia Commons image/media search tool, dubbed "Mayflower", to the toolserver; it's available at http://tools.wikimedia.de/~tangotango/mayflower/.
Just to recap, this tool allows full-text searching of the Commons database, returning a gallery-based results page, much like Google Images and similar services. The main goal was to make a user-friendly interface, so that even non-Wikimedians can take advantage of it.
As I said back in January, I'm still interested in Brianna Laugher's idea of making a gallery-based, full-text search feature available as a MediaWiki extension, so that it can take advantage of the existing MediaWiki search index and the stability of the main servers.
Any comments and/or suggestions would be most welcome.
(Sorry if this looks like an advertisement; I don't usually advertise new tools, but I thought the foundation-l and commons-l communities would be interested.)
Cheers,
Tangotango
On 2/13/07, Tangotango tangotango@wikignome.net wrote:
Just to recap, this tool allows full-text searching of the Commons database, returning a gallery-based results page, much like Google Images and similar services. The main goal was to make a user-friendly interface, so that even non-Wikimedians can take advantage of it.
Fantastic. It would probably be useful to include the image page wikitext in the results, even if it lowered the density of the results.
"Copyright (c) 2006-2007 Tangotango. Your comments and suggestions are welcome."
should have an "Images and other content are copyright their respective owners" notice or the like. :)
Hi!
Great tool! Thank you for your work!
I found a minor problem. I tried to search for Rakaw/Ракаў/Раков and it finds images only for the query Rakaw. The images contain descriptions with all variants.
Maybe the tool searches only within references to page titles?
With best regards, Eugene.
2007/2/14, Eugene Zelenko eugene.zelenko@gmail.com:
Great tool! Thank you for your work!
I found a minor problem. I tried to search for Rakaw/Ракаў/Раков and it finds images only for the query Rakaw. The images contain descriptions with all variants.
Maybe the tool searches only within references to page titles?
No, this is another problem, which I have already noted on [[meta:User Talk:Tangotango/Mayflower]]: the tool does not handle non-ASCII characters, but instead acts as if they were spaces.
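To illustrate the behaviour described above, here is a minimal sketch in Python. It is not Mayflower's actual code; both function names are hypothetical, and the first one only imitates the reported symptom (non-ASCII characters acting as spaces), while the second shows one Unicode-aware alternative.

```python
import re

def ascii_only_tokens(text):
    # Imitates the reported symptom: every non-ASCII character is
    # replaced by a space before the text is split into search terms.
    cleaned = "".join(ch if ord(ch) < 128 else " " for ch in text)
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", cleaned)]

def unicode_tokens(text):
    # Unicode-aware alternative: for str patterns, \w matches word
    # characters in any script, so Cyrillic terms survive.
    return [t.lower() for t in re.findall(r"\w+", text)]

# The description from the bug report, containing all three spellings.
description = "Rakaw / Ракаў / Раков"

print(ascii_only_tokens(description))
print(unicode_tokens(description))
```

With the first tokenizer only "rakaw" ends up as a searchable term, which would match the symptom Eugene saw: queries for Ракаў or Раков have nothing to match against.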
On 13/02/07, Tangotango tangotango@wikignome.net wrote:
I've uploaded my Wikimedia Commons image/media search tool, dubbed "Mayflower", to the toolserver; it's available at http://tools.wikimedia.de/~tangotango/mayflower/. Any comments and/or suggestions would be most welcome.
Interesting!
I just did a search on "Canon camera". Um. Well, there are a few pictures of Canon cameras and lenses there. Mostly shots taken *with* a Canon camera. And one pic of [[:en:David Cameron]] ...
- d.
On 2/27/07, David Gerard dgerard@gmail.com wrote:
On 13/02/07, Tangotango tangotango@wikignome.net wrote:
I've uploaded my Wikimedia Commons image/media search tool, dubbed "Mayflower", to the toolserver; it's available at http://tools.wikimedia.de/~tangotango/mayflower/. Any comments and/or suggestions would be most welcome.
Interesting!
I just did a search on "Canon camera". Um. Well, there are a few pictures of Canon cameras and lenses there. Mostly shots taken *with* a Canon camera. And one pic of [[:en:David Cameron]] ...
The search would be well equipped to handle this if only we didn't use categories in a rather brain-dead manner.
In your "Canon camera" search most of the results just have the words "canon" and "camera" in random spots in the description text.
The search you want is:
http://tools.wikimedia.de/~tangotango/mayflower/search.php?q=Canon&ic=Ca...
But because of the obsession with sub-categorization on Commons, the endless number of sub-categories means you miss most of the results. :(
If we used categories well, you could depend on there being a category "cameras" and a category "canon products" on the pictures you want. Mayflower will gladly search for the intersection of these two taggings.
I'm sure someone will suggest having the search walk the category hierarchy, but not only is this computationally expensive, it will also give incorrect results: [[Category:Photos_taken_with_Canon]] is a child of [[Canon_cameras]], and I wouldn't be surprised to find that some of the "Taken with Canon foo5" categories are children of "Canon foo5".
I don't think that having Photos_taken_with_Canon as a child of Canon_cameras is actually unreasonable when you think of the category tree as a structure designed to help humans find categories. ... Nor do I think that using the category tree exclusively as a human aid is a problem: because there are too many semantic link types, drift will always make such simple relationships error-prone.
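A minimal sketch of why walking the hierarchy misleads, in Python. The only thing taken from the thread is the parent/child relation between the two categories; the file names, the dictionaries, and the function `images_in_tree` are hypothetical toy data, not anything Mayflower or MediaWiki provides.

```python
from collections import deque

# Toy category tree: child categories listed per parent (hypothetical data).
subcategories = {
    "Canon_cameras": ["Photos_taken_with_Canon"],
    "Photos_taken_with_Canon": [],
}

# Toy direct membership: images tagged with each category (hypothetical data).
members = {
    "Canon_cameras": {"Canon_EOS_body.jpg"},
    "Photos_taken_with_Canon": {"Sunset_over_lake.jpg"},
}

def images_in_tree(root):
    # Breadth-first walk collecting every image in root and its descendants.
    seen = {root}
    queue = deque([root])
    found = set()
    while queue:
        cat = queue.popleft()
        found |= members.get(cat, set())
        for child in subcategories.get(cat, []):
            if child not in seen:  # guard against cycles in the tree
                seen.add(child)
                queue.append(child)
    return found

print(images_in_tree("Canon_cameras"))
```

The walk returns the sunset photo as a "Canon camera" image even though it merely was taken with one, and on a real tree every extra level adds more lookups.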
If we ever want it to be as easy to find images on Commons as it is on commercial stock photography sites like Getty, we need to stop pretending that all categories are best used as tools for manual browsing.
On 27/02/07, Gregory Maxwell gmaxwell@gmail.com wrote:
If we ever want it to be as easy to find images on Commons as it is on commercial stock photography sites like Getty, we need to stop pretending that all categories are best used as tools for manual browsing.
If categories in MediaWiki can be turned into tags, and you can then do interesting tag queries (and you can be sure the system will be tested to its limit the moment it's possible), then quite a lot of the problems with categories on Commons go away without breaking the current category system on other wikis. Are the efforts to this end progressing at all?
- d.
On 2/27/07, David Gerard dgerard@gmail.com wrote:
On 27/02/07, Gregory Maxwell gmaxwell@gmail.com wrote:
If we ever want it to be as easy to find images on Commons as it is on commercial stock photography sites like Getty, we need to stop pretending that all categories are best used as tools for manual browsing.
If categories in MediaWiki can be turned into tags, and you can then do interesting tag queries (and you can be sure the system will be tested to its limit the moment it's possible), then quite a lot of the problems with categories on Commons go away without breaking the current category system on other wikis. Are the efforts to this end progressing at all?
You can do them right now with Mayflower. This is no accident. :) Mayflower supports giving you an intersection and then subtracting other categories from that intersection. More complex operations are easily possible, but I think that what we have is the most useful starting point.
For our purposes I don't see that using Mayflower to get started on this is bad... it's easier and less risky to do development outside of MediaWiki proper, and Mayflower is free software running on our hardware, and written by a community member. So it's not like relying on Google.
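A minimal sketch of that kind of query using plain Python sets. This is an illustration only, not Mayflower's implementation: the `search` function, the index layout, and the file and category names are all hypothetical.

```python
def search(index, include, exclude=()):
    # Intersect the image sets of all 'include' categories, then
    # subtract the image sets of all 'exclude' categories.
    include = list(include)
    if not include:
        return set()
    result = set(index.get(include[0], set()))
    for cat in include[1:]:
        result &= index.get(cat, set())
    for cat in exclude:
        result -= index.get(cat, set())
    return result

# Hypothetical tag-style index: category name -> images carrying it.
index = {
    "Cameras": {"Canon_EOS.jpg", "Nikon_F.jpg", "Canon_EOS_boxed.jpg"},
    "Canon products": {"Canon_EOS.jpg", "Canon_printer.jpg", "Canon_EOS_boxed.jpg"},
    "Packaging": {"Canon_EOS_boxed.jpg"},
}

both = search(index, ["Cameras", "Canon products"])
unboxed = search(index, ["Cameras", "Canon products"], exclude=["Packaging"])
print(sorted(both))
print(sorted(unboxed))
```

This only works well when images carry the general categories directly, which is the point about using categories as tags.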
The next step will be to raise awareness. Once more people know about this there will be less opposition to using categories as tags.
On 27/02/07, Gregory Maxwell gmaxwell@gmail.com wrote:
The next step will be to raise awareness. Once more people know about this there will be less opposition to using categories as tags.
If Mayflower was put on en:wp, would the toolserver melt?
- d.
On 2/27/07, David Gerard dgerard@gmail.com wrote:
The next step will be to raise awareness. Once more people know about this there will be less opposition to using categories as tags.
If Mayflower was put on en:wp, would the toolserver melt?
Probably, but that is being fixed.
"Gregory Maxwell" gmaxwell@gmail.com wrote on Tuesday, February 27, 2007 2:37 PM:
The next step will be to raise awareness. Once more people know about this there will be less opposition to using categories as tags.
I say it again: What about categories without spaces or underscores (one word a category) and article/catscans for the combinations?
Regards,
Flo
On 2/27/07, Florian Straub flominator@gmx.net wrote:
"Gregory Maxwell" gmaxwell@gmail.com wrote on Tuesday, February 27, 2007 2:37 PM:
The next step will be to raise awareness. Once more people know about this there will be less opposition to using categories as tags.
I say it again: What about categories without spaces or underscores (one word a category) and article/catscans for the combinations?
"One word" is a little draconian.
For example, achieving "United Nations" by intersecting "United" and "Nations" is probably wrongheaded.
Ideally a category will express one idea: membership in one group, or relationship to a single collection of things. They should not be precomputed intersections of distinct things such as "Images of overweight men smoking cigars who were US presidents".
If cats express one idea or group then many will be one word, but not all. :)
I also don't think it's wrong to have some categories effectively be precomputed intersections, if they are particularly useful ones or interesting ones... but that shouldn't excuse us from also putting in the more general ones that apply. More data is usually useful.