[Foundation-l] Darwinian search on wikipedia
Benjamin Piwowarski
benjamin at bpiwowar.net
Tue May 26 11:00:31 UTC 2009
Hi,
I hope this is the right list to post this email - otherwise I would
appreciate being directed to the right one.
In short, I would like to propose a project to open the search box
to external parties. My main motivation is one shared by many
researchers in interactive information retrieval (IIR): new IIR
techniques have to be evaluated, and evaluation requires enough users
willing to try new approaches. It is possible to run simulations or
small-scale experiments, but validating such approaches requires much
larger databases.
My proposal is to add a third option below the search box: an external
search engine that communicates with Wikipedia in order to provide
search results. The communication protocol would let Wikipedia control
what is happening, in order to avoid problems (from latency to spam).
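
To make the kind of control I have in mind more concrete, here is a
minimal sketch, not a proposed implementation. It assumes Wikipedia
calls the external engine with a time limit, falls back to its own
search on timeout or error, and caps the number of results. All names
and values (guarded_search, external_search, local_search, timeout_s,
max_results) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical worker pool used to call external engines.
_pool = ThreadPoolExecutor(max_workers=4)


def guarded_search(query, external_search, local_search,
                   timeout_s=0.5, max_results=20):
    """Query an external engine, but keep the host site in control.

    external_search and local_search are callables that take a query
    string and return a list of results (an assumption of this sketch).
    """
    future = _pool.submit(external_search, query)
    try:
        results = future.result(timeout=timeout_s)
    except FutureTimeout:
        # Too slow: give up on the external engine and use the local one.
        future.cancel()
        return local_search(query)
    except Exception:
        # The external engine failed: use the local one.
        return local_search(query)
    # Cap the number of results as a crude guard against result flooding.
    return list(results)[:max_results]
```

A real protocol would also need to validate that returned results
point to actual Wikipedia pages; that part is left out here.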
With this option, a user could use either a "random" search engine or
one set in their preferences.
I would suggest that the random choice not be uniform: it should favour
good search engines over bad ones, hence the title "Darwinian search".
That would improve the quality of this search option over time, while
stimulating research in my area.
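
One simple way to get this "not so random" behaviour is weighted
random selection: each engine is picked with a probability proportional
to a quality score, and the score is updated from user feedback. The
sketch below is only an illustration under my own assumptions: the
function names, the score floor, the update rate, and the idea of a
reward between 0 and 1 (for example, whether the user clicked a result)
are all hypothetical.

```python
import random


def pick_engine(scores, rng=random, floor=0.05):
    """Pick an engine name with probability proportional to its score.

    scores maps engine names to non-negative quality scores. The floor
    keeps weak or new engines from dropping to zero probability, so they
    still get an occasional chance to prove themselves.
    """
    names = list(scores)
    weights = [max(scores[name], floor) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def update_score(scores, name, reward, rate=0.1):
    """Move an engine's score toward the latest reward (a value in [0, 1]).

    This is an exponential moving average: recent feedback counts more
    than old feedback.
    """
    scores[name] = (1 - rate) * scores[name] + rate * reward
```

For example, with scores = {"engine_a": 0.9, "engine_b": 0.1},
pick_engine(scores) returns "engine_a" about nine times out of ten,
and a run of bad feedback for "engine_a" gradually shifts traffic
toward "engine_b".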
I think it would also benefit Wikipedia, since:
1) it would distribute the search load to other back ends
2) it would improve search quality (and may change the way people use
Wikipedia), and could be adopted by Wikipedia as a default in the
longer term
3) it would not cost much - once the API and the main quality-control
mechanisms are in place, the system would run by itself
I will not go into more detail here, since I first want to know whether
there is any interest.
Best regards,
Benjamin Piwowarski (University of Glasgow, UK)