[Foundation-l] Darwinian search on wikipedia

Benjamin Piwowarski benjamin at bpiwowar.net
Tue May 26 11:00:31 UTC 2009


I hope this is the right list to post this email - otherwise I would  
appreciate being directed to the right one.

In short, I would like to propose a project to open the search box
to external entities. My main motivation is one shared by many
researchers in interactive information retrieval (IIR): to run
experiments on new IIR techniques, we need to evaluate them, and
hence need enough users to try the new approaches. It is possible to
simulate users or to run small-scale experiments, but validating such
approaches requires much larger databases.

My proposal is to add a third option below the search box: using an
external search engine that communicates with Wikipedia in order to
provide search results. The communication protocol would let
Wikipedia control what is happening, in order to avoid problems
(from latency to spam).
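
As a rough illustration of the kind of control meant here, the sketch
below wraps a call to an external engine with a time limit and a cap
on the number of results, falling back to the local search when the
external engine is too slow or fails. Everything in it is an
assumption made for illustration: the names guarded_search,
slow_engine and local_engine, the timeout and the result cap are
hypothetical and not part of any existing Wikipedia or MediaWiki API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def guarded_search(query, external, fallback, timeout_s=0.5, max_results=20):
    """Query an external engine, falling back to the local one on trouble.

    `external` and `fallback` are callables taking a query string and
    returning a list of results (hypothetical interface).
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(external, query)
    try:
        # Latency control: wait at most timeout_s for the external engine.
        results = future.result(timeout=timeout_s)
    except Exception:
        # Covers both a timeout and an error raised by the engine.
        results = fallback(query)
    finally:
        # Do not block the user while a slow engine finishes.
        pool.shutdown(wait=False)
    # Crude spam control: never return more than max_results entries.
    return list(results)[:max_results]


# Hypothetical engines, for illustration only.
def slow_engine(query):
    time.sleep(0.3)
    return ["late result for " + query]


def local_engine(query):
    return ["local: " + query]
```

With a 50 ms limit, guarded_search("darwin", slow_engine, local_engine,
timeout_s=0.05) gives up on the slow engine and returns the local
result instead.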

The search box would allow a user to use either a "random" search  
engine, or to use one that could be set in the preferences.

I would suggest that the randomness be not so random, in the sense
that it should favour good search engines over bad ones - hence the
title "Darwinian search". That would improve the quality of the
special search box over time, while stimulating research in my area.
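
One possible way to make the choice "not so random" is to pick each
engine with probability proportional to a quality score, and to
adjust that score after every search. The sketch below is only an
assumption about how this could work: the function names, the
multiplicative update, and the step and floor values are hypothetical,
and how user satisfaction would be measured is left open.

```python
import random


def pick_engine(scores, rng=random):
    """Pick an engine name with probability proportional to its score."""
    names = list(scores)
    weights = [scores[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def record_feedback(scores, name, satisfied, step=0.1, floor=0.05):
    """Raise the score after a satisfying search, lower it otherwise.

    The floor keeps every engine selectable, so a weak engine that
    later improves still gets a chance to recover.
    """
    factor = 1 + step if satisfied else 1 - step
    scores[name] = max(floor, scores[name] * factor)
```

Starting from equal scores, engines that satisfy users more often
are gradually chosen more often, while the floor keeps a small share
of traffic for the others.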

I think it would also be beneficial for Wikipedia, since
1) it distributes the search load to other back ends
2) it would improve search quality (and may change the way people use
Wikipedia), and the best engine may be included as a default by
Wikipedia in the longer term
3) it does not cost much - once the API and the main means to ensure
quality are set up, the system will run by itself

I will not go into more detail here, since I first want to know
whether there is some interest.

Best regards,
Benjamin Piwowarski (University of Glasgow, UK)
