I think that these questions might be best asked on the AI mailing list, so
I'm copying this thread to that list. I would suggest that further
discussion should take place on the AI mailing list, because that's where
the WMF experts on AI subjects are likely to participate.
Thanks for your interest in these questions. I too would be interested in
automating image categorization for Commons. I think that full automation
of image categorization would be a long term project, perhaps over the
course of decades, but there may be ways to make incremental progress. :)
On Wed, Jan 24, 2018 at 1:20 PM, John Erling Blad <jeblad(a)gmail.com> wrote:
> I had a plan to do a more thorough description on meta, but my plans are
> always too optimistic… ;)
> Question 1: I wonder if anyone has done any work on categorization of
> images on Commons by using neural nets.
> Question 2: I wonder if anyone has used Siamese networks to do more general
> categorization of images.
> A Siamese network is a kind of network used for face recognition. It is
> called a Siamese network because, in its simplest form, it is two parallel
> networks, but one of the networks is precomputed, so in effect you only
> evaluate one of them. Usually you have one face provided by a camera and
> run through one network, and a number of other precomputed faces for the
> other network. When the difference is small, you have a face match.
> In general you can have any kind of image as long as you can train a
> fingerprint-like vector for it. This fingerprint will then act like
> locality-sensitive hashing. Because you can create this LSH from the
> vector, you can store alternate models in a database, so you can later
> search it and find usable models. Those models can then interpret the
> vector form of the fingerprint, and by evaluating those models with actual
> input (i.e. the fingerprint vector) you can get an estimate of how probable
> it is that the image belongs to a specific category.
> What you would get from the previous process is a list of probable
> categorizations. That list can be sorted, and a user trying to categorize
> an image can then choose some of the categories.
> Note that the output of the network is not just a few categories; it can be
> a variable number of models that each output a probability for a specific
> category. It is a bit like map-reduce, where you find (filter) the
> possible models and then evaluate (map) those models with the fingerprint.
> It is perhaps not obvious, but the idea behind Siamese networks is two
> parallel networks, while the implementation uses a single active network.
> Also, the implemented network has a first part that computes the fingerprint
> vector (a kind of bottleneck network), and the second part is the stored
> models that take this fingerprint vector and each calculate a single output
> for the probability of a single category.
>  https://en.wikipedia.org/wiki/Locality-sensitive_hashing
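The lookup-and-evaluate flow described in the message above can be sketched roughly as follows. This is only a minimal illustration under several assumptions that are not in the original message: the fingerprint is a plain list of floats, the LSH key is the sign pattern of the fingerprint against a few fixed hyperplanes, and each stored model is a tiny logistic regression. The names `lsh_key`, `CategoryModel`, `build_index`, and `suggest_categories`, and all the numbers, are hypothetical.

```python
import math

def lsh_key(vector, planes):
    """Hash a vector to a tuple of bits: the sign of its dot product with each plane."""
    return tuple(
        int(sum(v * p for v, p in zip(vector, plane)) >= 0.0)
        for plane in planes
    )

class CategoryModel:
    """Hypothetical stored model: logistic regression over the fingerprint vector."""

    def __init__(self, category, weights, bias):
        self.category = category
        self.weights = weights
        self.bias = bias

    def probability(self, fingerprint):
        z = sum(w * x for w, x in zip(self.weights, fingerprint)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))

def build_index(entries, planes):
    """Store each model under the LSH key of a representative fingerprint."""
    index = {}
    for prototype, model in entries:
        index.setdefault(lsh_key(prototype, planes), []).append(model)
    return index

def suggest_categories(fingerprint, index, planes):
    # Filter: fetch only the models stored under the same LSH key.
    candidates = index.get(lsh_key(fingerprint, planes), [])
    # Map: evaluate each candidate model on the fingerprint vector.
    scored = [(m.category, m.probability(fingerprint)) for m in candidates]
    # Sort so the most probable categories are offered to the user first.
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy data (assumed): 2-dimensional fingerprints and axis-aligned hyperplanes.
planes = [[1.0, 0.0], [0.0, 1.0]]
index = build_index(
    [
        ([1.0, 1.0], CategoryModel("Bridges", [1.0, 1.0], -1.0)),
        ([2.0, 0.5], CategoryModel("Towers", [1.0, -1.0], 0.0)),
        ([-1.0, 1.0], CategoryModel("Cats", [-1.0, 1.0], 0.0)),
    ],
    planes,
)
suggestions = suggest_categories([0.9, 0.8], index, planes)
for category, p in suggestions:
    print(f"{category}: {p:.2f}")
```

In a real system the fingerprint would come from the trained network and the index would live in a database; an exact-bucket lookup like this one also misses near neighbours that fall just across a hyperplane, which practical LSH schemes address with multiple hash tables.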
> On Mon, Jan 22, 2018 at 6:47 AM, Pine W <wiki.pine(a)gmail.com> wrote:
> > Hi John,
> > I am having a little trouble with understanding your email from January
> > 15th. Could you perhaps state your question or point in a different way?
> > Pine <https://meta.wikimedia.org/wiki/User:Pine>
> > <https://en.wikipedia.org/wiki/User:CatherineMunro/Bright_Places>
> > On Mon, Jan 15, 2018 at 5:55 AM, John Erling Blad <jeblad(a)gmail.com>
> > wrote:
> > > This is the same as the entry on the wishlist for 2016, but describes
> > > the actual method.
> > > https://meta.wikimedia.org/wiki/2016_Community_Wishlist_
> > > Survey/Categories/Commons#Use_computer_vision_to_propose_categories
> > >
> > > Both contrastive and triplet loss can be used while learning, but
> > > are described at Wikipedia.
> > >
> > > On Sun, Jan 14, 2018 at 8:16 PM, Pine W <wiki.pine(a)gmail.com> wrote:
> > >
> > > > Hi John,
> > > >
> > > > I have not heard of an initiative to use Siamese neural networks for
> > > > image classification on Commons. You might make a suggestion on the AI,
> > > > Research, and/or Commons mailing lists regarding this idea. You might
> > > > also make a suggestion in IdeaLab
> > > > <https://meta.wikimedia.org/wiki/Grants:IdeaLab>.
> > > >
> > > > Pine <https://meta.wikimedia.org/wiki/User:Pine>
> > > > <https://en.wikipedia.org/wiki/User:CatherineMunro/Bright_Places>
> > > >
> > > > On Sun, Jan 14, 2018 at 3:46 AM, John Erling Blad <jeblad(a)gmail.com>
> > > > wrote:
> > > >
> > > > > Has anyone tried to use a Siamese neural network for image
> > > > > classification at Commons? I don't know if it will be good enough to
> > > > > run in autonomous mode, but it will probably be a huge help for those
> > > > > that do manual classification.
> > > > >
> > > > > Imagine a network providing a list of possible categories, and the
> > > > > user just ticks off usable categories.
> > > > >
> > > > > A Siamese network can be trained by using a triplet loss function,
> > > > > where the anchor and the positive candidate come from the same
> > > > > category, and the negative candidate comes from another category but
> > > > > is otherwise close to the anchor.
> > > > >
> > > > > Output from the network is like a fingerprint, and those can be
> > > > > compared to other images with known fingerprints, or against a
> > > > > generalized fingerprint for a category.
> > > > >
> > > > > John Erling Blad
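The triplet loss and the fingerprint comparison mentioned in the oldest message of this thread can be illustrated with a small sketch. It assumes fingerprints are plain lists of floats, uses the unsquared Euclidean distance (some formulations use the squared distance instead), and takes the "generalized fingerprint for a category" to be the mean of that category's fingerprints. The margin value and all function names here are hypothetical choices for illustration, not something stated in the thread.

```python
import math

def distance(a, b):
    """Euclidean distance between two fingerprint vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the positive by `margin`."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)

def centroid(fingerprints):
    """Assumed 'generalized fingerprint': the component-wise mean of a category's fingerprints."""
    n = len(fingerprints)
    return [sum(column) / n for column in zip(*fingerprints)]

def nearest_category(fingerprint, category_centroids):
    """Return the category whose centroid is closest to the given fingerprint."""
    return min(
        category_centroids,
        key=lambda name: distance(fingerprint, category_centroids[name]),
    )

# Toy usage with made-up 2-dimensional fingerprints.
centroids = {
    "Bridges": centroid([[1.0, 0.8], [1.0, 1.2]]),
    "Cats": centroid([[-1.0, 0.9], [-1.0, 1.1]]),
}
print(nearest_category([0.9, 1.2], centroids))
```

During training, the loss would be minimized over many triplets so that images from the same category end up with nearby fingerprints; choosing negatives that are already close to the anchor, as the message suggests, keeps the loss from being trivially zero.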
We would like to announce a research project with the goal of studying
whether user interactions recorded at the time of editing are suitable to
predict vandalism in real time.
Should vandal editing behavior be sufficiently different from normal
editing behavior, this would allow for a number of interesting real-time
prevention techniques. For example:
- withholding highly suspicious edits for review before publishing,
- a popup asking "I am not a vandal" (as in Google's "I am not a robot") to
analyze vandal reactions,
- a popup with a chat box to personally engage vandals, e.g., to help them
find other ways of stress relief or to understand them better,
- or at the very least: a new signal to improve traditional vandalism
detection.
We have set up a laboratory environment to study editor behavior in a
realistic setting using a private mirror of Wikipedia. No editing
whatsoever is conducted on the real Wikipedia as part of our experiments,
and all test subjects of our user studies are made aware of the
experimental nature of their editing. We plan on making use of
crowdsourcing as a means to attain scale and diversity.
If you wish to participate in this study as a test subject yourself, please
get in touch. The more diversity, the more insightful the results will be.
We are also happy to collaborate and to answer all questions that may arise
in relation to the project. For example, our setup and tooling may turn out
to be useful to study other user behavior-related things without having to
actually deploy experiments within the live MediaWiki.
PS: The AICaptcha project seems most closely related. @Vinitha and Gergő:
If you wish, we can set up a Skype meeting to talk about avenues for
 A group of students and researchers from Bauhaus-Universität Weimar (
www.webis.de) and Leipzig University (www.temir.org); project PI: Martin