I've been contributing to Wikipedia since 2001 and to Wikimedia Commons since 2005. I'm not new to this. Wikimedia Commons now has millions of images and thousands of categories. But it's really hard to find anything.
Today I was looking for photos of buildings that have an exterior lantern on their corner. This is common in the countryside, where there are no streetlights and no other buildings nearby. You put a lantern on the outside of your house, and you put it on the corner to cover two sides with one lamp.
So there are categories for buildings and for houses. And by country and by county. And by building material. And there is a category for exterior lanterns, but none of the photos there are really corner-mounted.
When I arrive at Commons, there is a search box, so I type "exterior lantern corner", but all it does is search text content and descriptions. So I get lots of hits in scanned documents, which is not what I want. I can limit my search to the "File" namespace, but that includes both JPEG photos and PDF documents.
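For illustration, the File-namespace restriction can be combined with a media-type filter in a single query. This is a minimal sketch, assuming the standard MediaWiki search API (`action=query`, `list=search`) and the CirrusSearch `filetype:` keyword. The helper name `build_search_url` is made up for this example, and it only builds the request URL without sending it.

```python
from urllib.parse import urlencode

# Commons API endpoint (standard MediaWiki api.php location).
API = "https://commons.wikimedia.org/w/api.php"


def build_search_url(terms, file_type="bitmap", limit=20):
    """Build a search URL limited to the File namespace and one media type.

    Hypothetical helper for illustration. It assumes the CirrusSearch
    "filetype:" keyword is available on the wiki being queried.
    """
    params = {
        "action": "query",
        "list": "search",
        # Append the media-type filter to the free-text terms.
        "srsearch": f"{terms} filetype:{file_type}",
        "srnamespace": 6,  # 6 is the File namespace
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


url = build_search_url("exterior lantern corner")
print(url)
```

This still matches on text only (titles, descriptions, categories), so it narrows the result set to photos but does not find visually similar images.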
Here are indeed some corner lamps, but not the rural kind I wanted: https://commons.wikimedia.org/wiki/File:Tinghuset_(lanterne).jpg https://commons.wikimedia.org/wiki/File:Fanal_p%C3%BAblic_LED.JPG
Here is what I am searching for, but the lamp (in the upper left) is too small to be useful in this photo: https://commons.wikimedia.org/wiki/File:V%C3%A5rdinge_f%C3%B6rsamlingshem.JP...
This is where I want to find more images like these, not in the same category, just similar photos. Surely, such a search function could be implemented in 2020, in addition to our current text search? Hasn't anybody done this already?
Special:MediaSearch seems to work better: https://commons.wikimedia.org/wiki/Special:MediaSearch?type=bitmap&q=ext... , if I've understood what you're looking for.
Kosta
On 16/11/2020 01:18, Lars Aronsson wrote:
[...]
Hi Lars–
Searching on Commons does indeed leave much to be desired. My team (Structured Data) has been working on a new search experience that is hopefully better; it involves both back-end improvements to the algorithm (adding structured data on File pages to the search index) and a new UI that’s better suited for media-based results.
This project is still kind of in beta, but it’s live now at Special:MediaSearch (https://commons.wikimedia.org/wiki/Special:MediaSearch?type=bitmap). We’re still making improvements to both the back-end search and the user experience. Here’s a little more about the project; feel free to share any thoughts or feedback on the discussion tab.
https://commons.wikimedia.org/wiki/Commons:Structured_data/Media_search
Best,
Eric
—
Eric Gardner Software Engineer, Structured Data Wikimedia Foundation
On Nov 16, 2020, at 1:38 AM, Kosta Harlan kharlan@wikimedia.org wrote:
[...]
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Nov 15, 2020, at 7:18 PM, Lars Aronsson lars@aronsson.se wrote:
This is where I want to find more images like these, not in the same category, just similar photos.
Perhaps not exactly what was asked, but Google Images has the ability. Go to images.google.com (http://images.google.com/), click the camera icon, and then you can paste a URL for the image. Note, you want the raw image, not the Commons page, so in your case, https://upload.wikimedia.org/wikipedia/commons/8/89/V%C3%A5rdinge_f%C3%B6rsamlingshem.JPG
Not surprisingly, in this particular example, it gives you back photos of other red farm houses, and doesn't understand that it's the corner lantern that you're really interested in. But still a useful tool to be aware of.
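For reference, the raw image URL can be derived from the file name rather than copied by hand. This is a sketch under stated assumptions: Commons uses MediaWiki's hashed upload directories, where the two directory levels are the first one and first two hex digits of the MD5 of the file name (with underscores instead of spaces), and the name is already in its canonical form with the first letter capitalized. The function name `raw_image_url` is made up for this example.

```python
import hashlib
from urllib.parse import quote

# Base path for original files on Commons, as seen in the URL above.
UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"


def raw_image_url(file_name):
    """Return the direct upload URL for a Commons file name.

    Hypothetical helper. Assumes the hashed-directory layout
    <h1>/<h1h2>/<name>, where h1 and h2 are the first two hex digits
    of the MD5 of the underscore-separated, UTF-8 encoded file name.
    """
    name = file_name.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # quote() percent-encodes non-ASCII characters as UTF-8 bytes.
    return f"{UPLOAD_BASE}/{digest[0]}/{digest[:2]}/{quote(name)}"


print(raw_image_url("Vårdinge församlingshem.JPG"))
```

If the layout assumption does not hold for a given file, linking through Special:FilePath on Commons and following the redirect is an alternative way to reach the raw image.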
On 2020-11-16 17:08, Roy Smith wrote:
Perhaps not exactly what was asked, but Google Images has the ability. Go to images.google.com (http://images.google.com), click the camera icon, and then you can paste a URL for the image.
This is perhaps the best suggestion so far. But it is limited.
Here I can input images from e-retail websites, screenshots from Google Street View, or just any copyrighted photos and find similar photos. If I add site:wikimedia.org to that search, I will find free images on Wikimedia Commons. Good!
But the search is not very useful. If I submit a photo where much of the building is visible, I get images of buildings in general, with or without lamps. If I submit a photo where much blue sky background is visible, I get light-blue images in general, with or without buildings or lamps. There is a rough visual similarity to the images, but not on a detailed level and not by semantic content.
The search does not allow me, as I would wish, to refine the search by indicating which images are better or worse for my search.
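The kind of refinement described here is known in information retrieval as relevance feedback. Below is a minimal sketch using the classic Rocchio update on toy feature vectors. It is not how Google Images or Commons works. The three-number vectors (building, sky, lamp) and the image names are invented for illustration, and a real system would use features extracted from the images.

```python
import math


def rocchio(query, liked, disliked, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward liked results and away from disliked ones."""

    def mean(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    pos, neg = mean(liked), mean(disliked)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query, images):
    """Return image names ordered from most to least similar to the query."""
    return sorted(images, key=lambda name: cosine(query, images[name]), reverse=True)


# Invented features: [building, sky, lamp]
images = {
    "farmhouse": [0.9, 0.3, 0.0],
    "blue_sky": [0.1, 0.9, 0.0],
    "corner_lamp": [0.6, 0.2, 0.8],
}
query = [0.8, 0.5, 0.1]  # a photo that is mostly building and sky

before = rank(query, images)
# The user marks one result as better and one as worse for this search.
refined = rocchio(query, liked=[images["corner_lamp"]], disliked=[images["blue_sky"]])
after = rank(refined, images)
print(before)
print(after)
```

In this toy data the plain farmhouse ranks first before the feedback and the corner-lamp photo ranks first after it, which is the behaviour asked for above.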
It would be interesting to know what kind of data model is behind the similarity function. My guess is that it needs to be 100 to 1000 times larger.
Both in OCR (optical character recognition) and in image similarity search, we now have so much data that we should be able to organize a huge databank of recognition data based on crowdsourcing. We'd need some game, where millions of users answer
1. What does this image depict?
2. How well does this image depict XYZ?
I was using the new search interface that Eric mentioned above and I was able to find a few exterior corner lamps indeed. So I'm wondering if you tried it and if you have any feedback:
https://commons.wikimedia.org/wiki/Special:MediaSearch?type=bitmap&q=ext...
On Tue, Nov 17, 2020 at 12:47 PM Lars Aronsson lars@aronsson.se wrote:
[...]
-- Lars Aronsson (lars@aronsson.se) Linköping, Sweden