Siamese networks and image classification


John Erling Blad

14 Jan 2018

11:46 a.m.

Has anyone tried to use a Siamese neural network for image classification at Commons? I don't know if it will be good enough to run in autonomous mode, but it will probably be a huge help for those that do manual classification.

Imagine a network providing a list of possible categories, and the user just ticks off usable categories.

A Siamese network can be trained with a triplet loss function, where the anchor and the positive candidate come from the same category, and the negative candidate comes from another category but is otherwise close to the anchor.
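
As a rough illustration of the triplet loss mentioned above, here is a minimal sketch in plain Python. The function names, the Euclidean distance, and the margin value of 0.2 are assumptions made for the example, not something specified in this thread; some formulations use squared distances instead.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two fingerprint vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is at least `margin` farther from the
    anchor than the positive is; positive otherwise."""
    d_pos = euclidean(anchor, positive)   # same category as the anchor
    d_neg = euclidean(anchor, negative)   # different category
    return max(0.0, d_pos - d_neg + margin)

# An easy negative (far away) contributes nothing to training.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# A hard negative (as close as the positive) still produces a loss.
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

This is why negatives that are "otherwise close to the anchor" matter: only those triplets give a non-zero loss and therefore a training signal.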

Output from the network is like a fingerprint, and those fingerprints can be compared to other images with known fingerprints, or against a generalized fingerprint for a category.
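
A small sketch of comparing a fingerprint against a "generalized fingerprint" for each category. Taking the mean of the known fingerprints as the generalized one is an assumption for illustration, and the category names and two-dimensional vectors are made up.

```python
import math

def centroid(vectors):
    """Component-wise mean, used here as the 'generalized fingerprint'."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_categories(fingerprint, known):
    """`known` maps a category name to the fingerprints of images
    already placed in that category. Closest category comes first."""
    scored = [(distance(fingerprint, centroid(vectors)), name)
              for name, vectors in known.items()]
    return [name for _, name in sorted(scored)]

# Hypothetical fingerprints of already-categorized images.
known = {
    "Cats": [[0.9, 0.1], [1.0, 0.0]],
    "Bridges": [[0.0, 1.0], [0.1, 0.9]],
}
suggestions = rank_categories([0.8, 0.2], known)
```

The resulting list is the kind of thing a user could tick categories off from.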

John Erling Blad


Pine W

14 Jan

7:16 p.m.

Hi John,

I have not heard of an initiative to use Siamese neural networks for image classification on Commons. You might make a suggestion on the AI, Research, and/or Commons mailing lists regarding this idea. You might also make a suggestion in IdeaLab https://meta.wikimedia.org/wiki/Grants:IdeaLab.

Pine https://meta.wikimedia.org/wiki/User:Pine https://en.wikipedia.org/wiki/User:CatherineMunro/Bright_Places

_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe

John Erling Blad

15 Jan

1:55 p.m.

This is the same as the entry on the wishlist for 2016, but describes the actual method. https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/Commons#Use_computer_vision_to_propose_categories

Both contrastive and triplet loss can be used during training, but neither is described on Wikipedia.
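
Since contrastive loss is not spelled out in the thread, here is a minimal sketch of one common formulation, which works on pairs rather than triplets. The function name, the margin of 1.0, and the omission of the 1/2 factor some texts include are choices made for the example.

```python
def contrastive_loss(dist, same_category, margin=1.0):
    """Loss for one pair of images, given the distance between
    their fingerprints.

    Pairs from the same category are pulled together (loss grows with
    distance). Pairs from different categories are pushed apart until
    they are at least `margin` away from each other.
    """
    if same_category:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

pulled = contrastive_loss(0.5, same_category=True)    # 0.25
pushed = contrastive_loss(0.5, same_category=False)   # 0.25
ignored = contrastive_loss(2.0, same_category=False)  # 0.0, already far apart
```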


Pine W

22 Jan

5:47 a.m.

Hi John,

I am having a little trouble with understanding your email from January 15th. Could you perhaps state your question or point in a different way?

Pine https://meta.wikimedia.org/wiki/User:Pine https://en.wikipedia.org/wiki/User:CatherineMunro/Bright_Places


John Erling Blad

24 Jan

9:20 p.m.

I had a plan to write a more thorough description on meta, but my plans are always too optimistic… ;)

Question 1: I wonder if anyone has done any work on categorization of images on Commons by using neural nets.

Question 2: I wonder if anyone has used Siamese networks to do more general categorization of images.

A Siamese network is a kind of network used for face recognition. It is called a Siamese network because, in its simplest form, it is two parallel networks, but the output of one of the networks is precomputed, so in effect you only evaluate one of them. Usually you have one face provided by a camera and run through one network, and a number of other precomputed faces for the other network. When the difference is small, you have a face match.
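
A sketch of that matching step, under loud assumptions: `embed` is a toy stand-in for the trained network (it only looks at the mean and spread of a few pixel values), and the names, pixel lists, and threshold are invented for the example.

```python
import math

def embed(pixels):
    """Toy stand-in for the trained network: maps an image to a vector.
    A real network would be learned; this is only for illustration."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# "Second branch": fingerprints of known faces, computed once and stored.
known_faces = {"alice": [0.2, 0.4, 0.6], "bob": [0.9, 0.9, 0.8]}
gallery = {name: embed(image) for name, image in known_faces.items()}

def match(pixels, gallery, threshold=0.1):
    """"First branch": the only network evaluation done at query time."""
    query = embed(pixels)
    name, vector = min(gallery.items(),
                       key=lambda item: distance(query, item[1]))
    # Small difference means a match; otherwise report no match.
    return name if distance(query, vector) <= threshold else None

hit = match([0.21, 0.4, 0.6], gallery)
miss = match([0.0, 0.0, 1.0], gallery)
```

Both branches use the same `embed` function, which is the shared-weights idea; only the stored side is computed ahead of time.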

In general you can use any kind of image as long as you can train a fingerprint-like vector. This fingerprint then acts like locality-sensitive hashing [1]. Because you can create this LSH from the vector, you can store alternative models in a database, so that you can later search it and find usable models. Those models can then interpret the vector form of the fingerprint, and by evaluating those models with actual input (i.e. the fingerprint vector) you can get an estimate of how probable it is that the image belongs to a specific category.
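
One way this lookup could work is sketched below. The random-hyperplane hashing scheme is a technique swapped in for illustration and is not named in this thread; the in-memory dictionary stands in for the database, and all names and vectors are hypothetical.

```python
import random

def make_planes(dim, bits, seed=0):
    """Random hyperplanes; each one contributes one bit to the hash."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

def lsh_key(vector, planes):
    """Bit string recording on which side of each plane the vector lies.
    Vectors pointing in similar directions tend to share a key."""
    return "".join(
        "1" if sum(p * v for p, v in zip(plane, vector)) >= 0 else "0"
        for plane in planes
    )

planes = make_planes(dim=4, bits=8)
model_index = {}  # stand-in for the database: hash key -> category models

def register(category, example_fingerprint):
    """Store a category model under the bucket of a typical fingerprint."""
    model_index.setdefault(lsh_key(example_fingerprint, planes), []).append(category)

def candidates(fingerprint):
    """Find the stored models worth evaluating for this fingerprint."""
    return model_index.get(lsh_key(fingerprint, planes), [])

register("Cats", [1.0, 0.2, 0.0, 0.1])
found = candidates([2.0, 0.4, 0.0, 0.2])  # same direction, same bucket
```

A real system would probably probe several nearby buckets or use several hash tables, since similar vectors can still land on opposite sides of one plane.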

What you would get from the previous process is a list of probable categorizations. That list can be sorted, and a user trying to categorize an image can then choose to select some of the categories.

Note that the output of the network is not just a few categories; it can be a variable number of models, each of which outputs a probability for a specific category. It is a bit like map-reduce, where you find (filter) the possible models and then evaluate (map) those models with the fingerprint vector.

It is perhaps not obvious, but although the idea behind Siamese networks is two parallel networks, the implementation uses a single active network. Also, the implemented network has a first part that computes the fingerprint vector (a kind of bottleneck network), and the second part is the stored models, each of which takes this fingerprint vector and calculates a single output for the probability of a single category.
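
The second part could look something like the sketch below. Modelling each stored model as a single logistic unit is an assumption for illustration, as are the category names, weights, and the example fingerprint; the fingerprint itself is taken as already computed by the first part.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical stored models: one small logistic unit per category,
# each taking the fingerprint vector and giving one probability.
category_models = {
    "Cats": {"weights": [4.0, -4.0], "bias": 0.0},
    "Bridges": {"weights": [-4.0, 4.0], "bias": 0.0},
    "Animals": {"weights": [2.0, -1.0], "bias": 0.0},
}

def suggest(fingerprint, models, candidates=None):
    """Evaluate (map) the selected models on one fingerprint and return
    (category, probability) pairs, most probable first.

    `candidates` is the optional result of an earlier filter step;
    when omitted, every stored model is evaluated.
    """
    names = candidates if candidates is not None else list(models)
    scored = []
    for name in names:
        model = models[name]
        z = sum(w * x for w, x in zip(model["weights"], fingerprint))
        scored.append((name, sigmoid(z + model["bias"])))
    return sorted(scored, key=lambda item: item[1], reverse=True)

ranked = suggest([0.8, 0.2], category_models)
```

The fingerprint is computed once per image, so adding a new category only means storing one more small model rather than retraining the first part.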

[1] https://en.wikipedia.org/wiki/Locality-sensitive_hashing


Pine W

31 Jan

6:18 a.m.

Hi John,

I think that these questions might be best asked on the AI mailing list, so I'm copying this thread to that list. I would suggest that further discussion should take place on the AI mailing list, because that's where the WMF experts on AI subjects are likely to participate.

Thanks for your interest in these questions. I too would be interested in automating image categorization for Commons. I think that full automation of image categorization would be a long-term project, perhaps over the course of decades, but there may be ways to make incremental progress. :)

Pine https://meta.wikimedia.org/wiki/User:Pine https://en.wikipedia.org/wiki/User:CatherineMunro/Bright_Places

