Hi,
We have developed a fast similarity search algorithm (FastSS) for keyword search and tested it on the English Wikipedia articles. The similarity metric is the edit distance between words, which is language-independent. Results are ordered by number of occurrences and by edit distance. A demo of our prototype is available at http://fastss.csg.uzh.ch/.
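For readers unfamiliar with the metric, here is a minimal Python sketch of edit distance (Levenshtein) and of one possible way to order matches. It is not the FastSS implementation: it compares the query against every word, which is exactly the brute-force scan a fast index is meant to avoid. The names `edit_distance`, `rank_matches`, `word_counts` and `max_distance` are hypothetical, and the ordering (closest first, ties broken by higher occurrence) is an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def rank_matches(query, word_counts, max_distance=2):
    """Brute-force lookup (assumed ordering): keep words within max_distance
    of the query, closest first, ties broken by higher occurrence count."""
    hits = []
    for word, count in word_counts.items():
        distance = edit_distance(query, word)
        if distance <= max_distance:
            hits.append((distance, -count, word))
    return [(word, distance, -neg_count)
            for distance, neg_count, word in sorted(hits)]
```

Because Python strings are sequences of Unicode code points, a precomposed umlaut counts as a single character here; text in decomposed form would need normalizing first.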
Indexing the complete English Wikipedia takes about 3 days. We are still working on improving performance, fixing issues with umlauts, improving the rendering of the results page, and including article titles in the indexing phase.
Should we work towards integrating FastSS into MediaWiki? Would you be interested in including FastSS in Wikimedia? Any comments are welcome.
Regards,
Thomas Bocek