On 9/26/2011 8:06 PM, とある白い猫 wrote:
Certainly it is their first attempt at plant identification, and they are dealing with very few plant species as well. Such phone apps could eventually be very useful: for example, you point your phone at... anything and it links to the Wikipedia article, in the language setting of your phone! Consider this lab's progress in the next few decades... To them we are just another image repository, but unlike other image repositories we seek to tag images in a less specialized manner. For instance, an image repository for birds would generally avoid tagging birds by color and instead tag them by species or some other scientific categorization; Commons, on the other hand, wouldn't mind such categorization.
As I said before, people shouldn't expect perfect species identification overnight, but perhaps this could be the result of research over the next few decades. They are currently considering dropping Commons because they do not see any use for it; we could show them potential fields of research using Commons by basically telling them about our problems. For instance, dealing with images we frequently delete (over copyright, trolling, etc.) could easily be a task for them.
-- とある白い猫 (To Aru Shiroi Neko)
I was talking with my wife about this, who is a bit of a botanist. Tree leaves are often easy, but a wider identification of plants requires inspection of the flowers and other details. And it can be hard; I've seen professors absolutely baffled by weeds with a dominating presence in the environment. My wife and I spent a week in the Dominican Republic, where we found 30 different species of weeds -- all of which are common across the tropical world... We were able to identify fewer than half of them and did worse with the grasses.
Machine vision or not, the world could use a mobile app with an expert system that helps people identify plants -- I'd think that databases like DBpedia would be a good place to start.
Now, if you want to use Commons images in a machine vision project, you need to find something specific for which enough information is available. You're going to need about 10,000 images for training and test purposes. That can be anything from one big class to maybe 100 classes with 100 examples each. You can probably extract something like this out of Commons using semantic information from DBpedia and Freebase.
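To make the "100 classes with 100 examples each" case concrete, here is a rough Python sketch of how such a set might be divided into training and test portions, class by class. The file names, class labels, and the 80/20 ratio are all made up for illustration; nothing here actually queries Commons, DBpedia, or Freebase.

```python
import random
from collections import defaultdict


def stratified_split(labeled_images, test_fraction=0.2, seed=0):
    """Split (image_id, label) pairs into train and test lists.

    Each class is split separately, so every class appears in both
    lists in roughly the same proportion.
    """
    by_class = defaultdict(list)
    for image_id, label in labeled_images:
        by_class[label].append(image_id)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_class):
        ids = sorted(by_class[label])
        rng.shuffle(ids)
        # Hold out at least one image per class for testing.
        n_test = max(1, round(len(ids) * test_fraction))
        test.extend((image_id, label) for image_id in ids[:n_test])
        train.extend((image_id, label) for image_id in ids[n_test:])
    return train, test


# Hypothetical data set: 100 classes with 100 images each (10,000 total).
# In practice the (file, label) pairs would come from category or
# semantic metadata rather than being generated like this.
dataset = [
    (f"File:class{c:03d}_img{i:03d}.jpg", f"class{c:03d}")
    for c in range(100)
    for i in range(100)
]

train, test = stratified_split(dataset)
print(len(train), len(test))  # 8000 2000
```

Splitting per class rather than over the whole pile matters once the classes are uneven, which they almost certainly would be for anything pulled out of Commons.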