I think this is a really good use of cross-lingual retrieval technology, because images
are language-neutral. If the system returns an image of a tree found on an Arabic site, I
can still understand the image and tell that it's a tree. In contrast, if I search for
*text* about trees and the system returns an Arabic page on trees, I can't make use of
it unless I can at least read Arabic.
One comment about the user interface of your prototype: it's too focused on the query
words, and not focused enough on the images. Right now, to find images, I have to:
A) Type a query (e.g. "tree")
B) Tell the system my query is in English
C) Select the meaning I intend for the word (e.g. "a large wooden plant")
D) Select the language I want to search in (e.g. Arabic)
Only after I have done these things do I get to finally see images.
I would suggest the following changes:
Have the query language set to English by default (I'm French, so I'm allowed to
say that ;-), and have a pick list where people can change that (and store that preference
in a cookie).
Skip C) and D) altogether, and search for images that match all meanings and all
translations in all languages. Maybe that doesn't work well and produces too many
Alain Désilets, National Research Council of Canada
Chair, WikiSym 2007
2007 International Symposium on Wikis
Wikis at Work in the World:
Open, Organic, Participatory Media for the 21st Century
Behalf Of Brianna Laugher
Sent: September 12, 2007 10:22 PM
To: commons-l; wiktionary-l(a)lists.wikimedia.org; Wikimedia
Foundation Mailing List; wiki-research-l(a)lists.wikimedia.org
Subject: [Wiki-research-l] PanImages: cross-lingual image
search created using 'Wiktionaries'
PanImages Image Search Tool Speaks Hundreds Of Languages
PanImages' powerful brains were created by scanning more than
350 machine-readable online dictionaries. Some of these were
"wiktionaries," online multilingual dictionaries written by
volunteers. The PanImages software scans these dictionaries
and uses an algorithm to check the accuracy of the results.
It then assembles the results in a matrix that allows
translation in combinations that may never have been
attempted - for instance, from Gujarati to Lithuanian.
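To make the "matrix" idea concrete, here is a minimal sketch of how entries from separate bilingual dictionaries could be chained to reach a language pair that no single dictionary covers. This is not PanImages' actual algorithm: the function names (`build_graph`, `translate`), the toy entries, and the hop limit are all assumptions made for illustration, and the accuracy check mentioned above is not modeled at all.

```python
from collections import defaultdict, deque

def build_graph(dictionaries):
    """Merge bilingual dictionaries into one undirected graph.

    Each node is a (language_code, word) tuple; each edge means
    "some dictionary lists these two words as translations".
    """
    graph = defaultdict(set)
    for entries in dictionaries:
        for a, b in entries:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def translate(graph, source, target_lang, max_hops=2):
    """Breadth-first search for words in target_lang within max_hops edges."""
    seen = {source}
    queue = deque([(source, 0)])
    found = set()
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for nxt in graph[node]:
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt[0] == target_lang:
                found.add(nxt[1])
            else:
                queue.append((nxt, hops + 1))
    return found

# Hypothetical toy data (romanized, illustrative only): one Gujarati-English
# dictionary and one English-Lithuanian dictionary, with no direct
# Gujarati-Lithuanian entries.
gu_en = [(("gu", "jhad"), ("en", "tree"))]
en_lt = [(("en", "tree"), ("lt", "medis"))]

graph = build_graph([gu_en, en_lt])
result = translate(graph, ("gu", "jhad"), "lt")
```

Chaining through a pivot word like this goes wrong when the pivot is ambiguous (a word with several senses links unrelated translations), which is presumably why a real system needs the accuracy check the announcement mentions.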
The actual search engine is here:
And the research paper detailing the algorithm and method is here:
An idea to use Wiktionary or interwiki links to improve image
search for Commons has long been kicked around. Maybe we
could collaborate with them to improve the Mayflower search
engine for Commons? (Or else ask them to index
and pay attention to license metadata
:)) After all, we supplied them with all this useful data for free.
They've just been waiting in a mountain for the right moment: