[Commons-l] Automated identification of images on commons
Andre Engels
andreengels at gmail.com
Mon Sep 26 17:32:37 UTC 2011
On Mon, Sep 26, 2011 at 6:43 PM, Paul Houle <paul at ontology2.com> wrote:
> I've made some attempt to map images on Wikimedia commons to
> distinct concepts from DBpedia, see
>
> http://ookaboo.com/
>
> This could be useful for forming a training set, but I haven't yet
> got around to releasing a public dump of the data. I have about 1 million
> things classified and could certainly extend the strategies used to get
> more.
>
> Unless there's been a really unprecedented breakthrough, I'd think
> that the application of machine vision to Wikimedia faces the problem of
> getting enough training data. If you had thousands or tens of thousands of
> photos that were labeled 'cat' or 'not cat', or 'member of plant species X'
> or 'not member of plant species X', you could train a classifier to make the
> distinction. However, if you've got two or three bad photos of a
> particular plant (which is what you have most of the time in Commons) you
> don't have enough training data to generalize.
>
> If you've got a specific mission, say genital recognition, I think
> you can make progress, but to attack the general problem you need to go big
> with your training sets.
>
Every small category is part of a big category. A system such as this will
not be able to identify plant species, but it might well be able to find
pictures of plants. If it then gives a list of plant pictures that are not
in any plant category, animal pictures that are not in an animal category,
buildings that are not in a regional building category, maps that are not in
a "maps of" category, paintings that are not in a painter category, famous
people that are not in a people category, and so on, it could deliver those
to volunteers to classify further.
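
The triage step described above could be sketched roughly as follows. This
is only an illustration under assumptions: the coarse label is taken to come
from some classifier that is not shown, the Image record and the
COARSE_KEYWORDS table are hypothetical, and matching category names by
keyword is a crude stand-in for walking the real Commons category tree.

```python
from dataclasses import dataclass, field

# Hypothetical table: coarse label -> keywords expected in a matching category name.
COARSE_KEYWORDS = {
    "plant": ("plant", "flora"),
    "animal": ("animal", "fauna"),
    "building": ("building",),
    "map": ("map",),
}


@dataclass
class Image:
    title: str
    predicted_label: str  # assumed output of a coarse classifier (not shown)
    categories: list = field(default_factory=list)


def needs_review(image):
    """True if the image has a known coarse label but no category matching it."""
    keywords = COARSE_KEYWORDS.get(image.predicted_label)
    if keywords is None:
        return False  # label outside the coarse classes we handle
    names = [c.lower() for c in image.categories]
    return not any(k in name for name in names for k in keywords)


def review_queue(images):
    """Group titles of images needing review by their predicted coarse label."""
    queue = {}
    for image in images:
        if needs_review(image):
            queue.setdefault(image.predicted_label, []).append(image.title)
    return queue


images = [
    Image("A.jpg", "plant", ["Plants of Europe"]),
    Image("B.jpg", "plant", ["Unidentified photos"]),
    Image("C.jpg", "map", []),
    Image("D.jpg", "car", []),
]
queue = review_queue(images)
```

Each list in the resulting queue could then be handed to volunteers who know
that subject area, so the classifier only has to get the broad class right.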
--
André Engels, andreengels at gmail.com