Hello,
I am not sure this is the right mailing list to introduce this project, but I have just released Displee. It is a small Android app that lets you search for images in the English Wikipedia by taking pictures: https://play.google.com/store/apps/details?id=org.visualink.displee It is a kind of open source Google Goggles for images from en.wikipedia.org.
I have developed Displee as a demonstrator of Pastec (http://pastec.io), my open source image recognition index and search engine for mobile apps. The index hosted on my server in France currently contains about 440 000 images. They may not be the most relevant ones, but this is a start. ;-) I also have other ideas to improve this tiny app if the community is interested.
Displee source code (MIT) is available here: https://github.com/Visu4link/displee
Pastec source code (LGPL) is available here: https://github.com/Visu4link/pastec
The source code of the Displee back-end is not released yet. It is basically a Python 3 Django application.
I will be glad to receive your feedback and answer any questions!
Best regards,
Hi Adrien,
this looks very interesting - I'm happy to see your work, and I briefly looked into your sources and API. With your 440 000 images, do you have any clear idea about the accuracy of ORB? To explain: I'm working on Elog.io, which provides a service and API[1] *similar* to yours, but uses a rather different algorithm and store, and targets a different use case. Our algorithm is a variant of the Blockhash[2] algorithm, which does not do any feature detection at all, but which can easily run in a browser or on a mobile platform (we have versions for JavaScript, C and Python) to generate 256-bit hashes of images. With a Hamming distance calculation, we then determine the quality of a match.
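A minimal sketch of the general idea, for readers who have not seen this kind of hash: it averages the image over a 16x16 grid, thresholds each block against the median, and compares two hashes by Hamming distance. This is a simplified illustration and not the actual Blockhash algorithm; the function names and the restriction to grayscale images whose sides are multiples of the grid size are assumptions made to keep it short.

```python
from statistics import median


def block_mean_hash(pixels, grid=16):
    """Return a (grid * grid)-bit integer hash of a grayscale image.

    `pixels` is a list of rows of 0-255 intensities. Width and height are
    assumed to be multiples of `grid` (a simplification for this sketch).
    """
    height, width = len(pixels), len(pixels[0])
    bh, bw = height // grid, width // grid
    means = []
    for by in range(grid):
        for bx in range(grid):
            total = sum(
                pixels[y][x]
                for y in range(by * bh, (by + 1) * bh)
                for x in range(bx * bw, (bx + 1) * bw)
            )
            means.append(total / (bh * bw))
    threshold = median(means)
    bits = 0
    for mean in means:
        # One bit per block: 1 if the block is brighter than the median block.
        bits = (bits << 1) | (1 if mean > threshold else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Hypothetical test image: dark left half, bright right half.
    image = [[0] * 8 + [255] * 8 for _ in range(16)]
    print(hex(block_mean_hash(image)))
```

A small Hamming distance between two such hashes suggests the same picture, possibly rescaled or re-encoded; the threshold used to accept a match is a tuning choice that this sketch does not address.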
We work primarily on a use case of verbatim use, with a user getting images from Wikimedia and re-using them elsewhere. Algorithms without feature detection give very bad results for any modification to an image, like rotating, cropping, etc. But since that's not within our use case, it works, though the flip side is of course that you can't photograph something (a newspaper article with an image, for instance) and then match it against a set of images, as you expect to be able to do.
The other difference is that our database store isn't specifically tailored to our hashes: we use W3C Media Annotations to store any kind of metadata about images, and could equally well store your ORB signatures assuming they can be serialised.
To give you some numbers, for our use cases (verbatim use, potentially with a format change such as jpg->png, and scaling down to 100px width) we can successfully match about 87% of cases, and we have a collision rate (different images resulting in the same or nearly the same hashes) of about 1.2%. Both numbers are measured against the Wikimedia Commons set.
While we currently have the full ~22M images from Wikimedia Commons in our database, we're still ironing out the kinks of the system and making some additional improvements. If you think that we should consider ORB instead of or in addition to our current algorithms, we'd love to give that a try, and it'd obviously be very interesting if we could end up having compatible signatures compared to your database.
Sincerely, Jonas
[1] http://docs.cmcatalog.apiary.io
[2] http://blockhash.io
On 24 November 2014 at 11:25, Adrien Maglo adrien@visualink.io wrote:
-- Adrien Maglo Pastec developer http://www.pastec.io +33 6 27 94 34 41
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 24/11/2014 at 13:32, Jonas Öberg wrote:
> Hi Adrien,
Hi Jonas,
[...]
Thank you for your email. It is very interesting to learn about your project! I think you are tackling a big issue for photographers.
> While we currently have the full ~22M images from Wikimedia Commons in our database, we're still ironing out the kinks of the system and making some additional improvements. If you think that we should consider ORB instead of or in addition to our current algorithms, we'd love to give that a try, and it'd obviously be very interesting if we could end up having compatible signatures compared to your database.
Your hash approach uses very small signatures, which can be quickly transmitted over a network and searched. Using the "visual word" approach I use in Pastec would enable the matching of modified images but would also require a lot more resources. For instance, while your hash is 256 bits long, an image signature in the Pastec index is approximately 8 KB. Similarly, I guess that the search complexity of your hash approach is O(1), while in Pastec it is much more complicated: first a "tf-idf" ranking and then two geometrical rerankings... Finally, extracting ORB features is, I think, not something that could be done quickly in JavaScript.
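To make the first of those steps concrete, here is a minimal sketch of tf-idf ranking over "visual words" with an inverted index. It is a generic textbook formulation, not Pastec's implementation: the integer word IDs, the function names and the exact weighting are assumptions, and the two geometrical rerankings are left out entirely.

```python
import math
from collections import Counter, defaultdict


def build_index(images):
    """Build an inverted index: visual word -> {image_id: term frequency}.

    `images` maps an image id to the list of visual word ids found in it.
    """
    index = defaultdict(dict)
    for image_id, words in images.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count / len(words)
    return index


def search(index, n_images, query_words):
    """Return (image_id, score) pairs sorted by decreasing tf-idf score."""
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        postings = index.get(word)
        if not postings:
            continue  # word never seen in the indexed images
        # Rare words weigh more; a word present in every image weighs 0.
        idf = math.log(n_images / len(postings))
        q_tf = q_count / len(query_words)
        for image_id, tf in postings.items():
            scores[image_id] += (q_tf * idf) * (tf * idf)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: three images described by visual word ids.
    images = {"a": [1, 2, 3, 4], "b": [1, 5, 6, 7], "c": [8, 9, 10, 11]}
    index = build_index(images)
    print(search(index, len(images), [2, 3, 1]))
```

Only images sharing at least one word with the query are touched, which is what makes an inverted index attractive, but the cost still grows with the length of the posting lists, unlike a direct hash lookup.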
So my conclusion would be that, if you are fine with not matching modified images, ORB features will not be useful for your project. :-)
Best regards,
Hi Adrien!
> Using the "visual word" approach I use in Pastec would enable the matching of modified images but would also require a lot more resources. For instance, while your hash is 256 bits long, an image signature in the Pastec index is approximately 8 KB.
8 KB still isn't too bad. It sounds like it could be useful.
> Similarly, I guess that the search complexity of your hash approach is O(1), while in Pastec it is much more complicated: first a "tf-idf" ranking and then two geometrical rerankings...
Close to O(1) at least. How does Pastec scale to many images? You mentioned having about 400,000 currently, which is still a rather fair number, but what about the full ~22M of Wikimedia Commons? I'm assuming that since tf-idf is a well-known method for text mining, there are well-understood and optimised algorithms to search. Perhaps something like Elasticsearch would be useful right away too?
That would be an advantage, since with our blockhash, we've had to implement relevant search algorithms ourselves, for lack of existing implementations.
One problem that we see, and which was discussed recently on the commons-l mailing list, is the possibility of using approaches like yours and ours to identify duplicate images in Commons. We've generated a list of 21274 duplicate pairs, but some of them aren't actually duplicates, just very similar. Most commonly this is map data, like [1] and [2], where just a specific region differs.
I'm hypothesizing that your ORB detection would have better success there, since it would hopefully detect the colored area as a feature and be able to distinguish the two from each other.
In general, my feeling is that your work with ORB and our work with Blockhashes complement each other nicely. They work with different use cases, but have the same purpose, so being able to search using both would sometimes be an advantage. What is your strategy for scaling beyond your existing 400,000 images, and is there some way we can cooperate on this? As we go about hashing additional sets (Flickr is a prime candidate), it would be interesting for us if we could generate both our blockhash and your ORB visual words signature in an easy way, since we retrieve the images anyway.
[1] https://commons.wikimedia.org/wiki/File:Locator_map_Puerto_Rico_Trujillo_Alt...
[2] https://commons.wikimedia.org/wiki/File:Locator_map_Puerto_Rico_Carolina.png
Hello Jonas,
>> Similarly, I guess that the search complexity of your hash approach is O(1), while in Pastec it is much more complicated: first a "tf-idf" ranking and then two geometrical rerankings...
> Close to O(1) at least. How does Pastec scale to many images? You mentioned having about 400,000 currently, which is still a rather fair number, but what about the full ~22M of Wikimedia Commons? I'm assuming that since tf-idf is a well-known method for text mining, there are well-understood and optimised algorithms to search. Perhaps something like Elasticsearch would be useful right away too?
> That would be an advantage, since with our blockhash, we've had to implement relevant search algorithms ourselves, for lack of existing implementations.
The tf-idf method used in Pastec is an adaptation of the algorithm for image ranking, so unfortunately it also seems complicated to reuse implementations designed for text. To return results in real time, the Pastec index must fit in RAM. Having about 1M images per instance seems possible, but to target the 22M of Wikimedia Commons, several instances running on several servers would be required. When there are many images on an instance, the search times also increase significantly.
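As a rough back-of-the-envelope check of these figures, the sketch below combines the numbers quoted in this thread (about 8 KB per Pastec signature, 256-bit hashes, about 1M images per instance, ~22M images on Commons). It counts signature payload only; index overhead, the real per-image size and the helper names are assumptions, so the results are orders of magnitude rather than measurements.

```python
import math

SIGNATURE_BYTES = 8 * 1024       # "approximately 8 KB" per Pastec signature
HASH_BYTES = 256 // 8            # one 256-bit hash
IMAGES_PER_INSTANCE = 1_000_000  # "about 1M images per instance"
COMMONS_IMAGES = 22_000_000      # "~22M" images on Wikimedia Commons


def payload_bytes(n_images, bytes_per_image):
    """Raw signature storage, ignoring any index overhead."""
    return n_images * bytes_per_image


def instances_needed(n_images, per_instance=IMAGES_PER_INSTANCE):
    """Number of in-RAM instances needed to hold n_images."""
    return math.ceil(n_images / per_instance)


def gib(n_bytes):
    """Convert bytes to GiB for display."""
    return n_bytes / 2**30


if __name__ == "__main__":
    per_instance = payload_bytes(IMAGES_PER_INSTANCE, SIGNATURE_BYTES)
    full_index = payload_bytes(COMMONS_IMAGES, SIGNATURE_BYTES)
    full_hashes = payload_bytes(COMMONS_IMAGES, HASH_BYTES)
    print(f"one instance (1M images): {gib(per_instance):.1f} GiB")
    print(f"all of Commons, signatures: {gib(full_index):.1f} GiB")
    print(f"all of Commons, 256-bit hashes: {gib(full_hashes):.2f} GiB")
    print(f"instances needed: {instances_needed(COMMONS_IMAGES)}")
```

Under these assumptions the signatures are 256 times larger per image than the hashes, which is why one approach needs several servers for Commons while the other fits comfortably on one.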
> One problem that we see, and which was discussed recently on the commons-l mailing list, is the possibility of using approaches like yours and ours to identify duplicate images in Commons. We've generated a list of 21274 duplicate pairs, but some of them aren't actually duplicates, just very similar. Most commonly this is map data, like [1] and [2], where just a specific region differs.
> I'm hypothesizing that your ORB detection would have better success there, since it would hopefully detect the colored area as a feature and be able to distinguish the two from each other.
Unfortunately, ORB features won't help you much here. They are computed only on the luminance plane and are located at edge zones. They aim at retrieving similar images, and in your example the two images are perfect candidates for that.
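A small sketch of why working on luminance alone can hide a color difference: two fills with clearly different hues can map to almost the same gray level. The Rec. 601 weights below are the common RGB-to-gray conversion and are assumed here rather than taken from Pastec; the two colors are hypothetical, and whether the fills in the two maps above are actually this close in luminance has not been checked.

```python
def luma(rgb):
    """Rec. 601 luma of an (R, G, B) triple with 0-255 components."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


if __name__ == "__main__":
    # Hypothetical fill colors: visibly different hues, similar brightness.
    reddish = (200, 60, 60)
    greenish = (60, 130, 60)
    print(f"reddish:  {luma(reddish):.2f}")
    print(f"greenish: {luma(greenish):.2f}")
```

A detector that only sees the gray image has little to distinguish two regions like these, beyond the edges they share with their surroundings.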
> In general, my feeling is that your work with ORB and our work with Blockhashes complement each other nicely. They work with different use cases, but have the same purpose, so being able to search using both would sometimes be an advantage. What is your strategy for scaling beyond your existing 400,000 images, and is there some way we can cooperate on this? As we go about hashing additional sets (Flickr is a prime candidate), it would be interesting for us if we could generate both our blockhash and your ORB visual words signature in an easy way, since we retrieve the images anyway.
Currently, I am not planning to scale much beyond ~1M images, as I do not have the computational resources. I think that your small hash approach, despite being less robust to image modifications, is much better suited to such databases. It would be possible to store and search the index on disk, but that would be very slow and thus not practical.
Best regards,
CC'ing mobile-l to let them know of this
Exciting to see projects like this showing up. I do notice that it's not compatible with Nexus5, Nexus7, and a number of other standard devices.
Looking at your manifest it should just work. Did you put in a Google Play restriction on which devices it works on?
--tomasz
On 24/11/2014 at 20:28, Tomasz Finc wrote:
> CC'ing mobile-l to let them know of this
Thanks!
> Exciting to see projects like this showing up. I do notice that it's not compatible with Nexus5, Nexus7, and a number of other standard devices.
> Looking at your manifest it should just work. Did you put in a Google Play restriction on which devices it works on?
Yes indeed, it should work on those devices too. I did not put any restriction on the Play store but I will investigate that.
Best regards,
On 24/11/2014 at 20:46, Adrien Maglo wrote:
>> Exciting to see projects like this showing up. I do notice that it's not compatible with Nexus5, Nexus7, and a number of other standard devices.
>> Looking at your manifest it should just work. Did you put in a Google Play restriction on which devices it works on?
> Yes indeed, it should work on those devices too. I did not put any restriction on the Play store but I will investigate that.
I may have forgotten to enable it for all countries. This might be the reason for the issue you encountered. This should be updated in the next few hours. Otherwise, the APK can be downloaded here: http://pastec.io/files/Displee-2.0.1.apk
Best regards,