On 26/10/2007, Steve Bennett stevagewp@gmail.com wrote:
On 10/26/07, Rolf Lampa rolf.lampa@rilnet.com wrote:
Soundex.
Title variants, very often due to differences in spelling, are an old problem which was solved a long time ago, long before computers came about. The (old) solution was based on the fact that sound absorbs differences in spelling etc., hence "Soundex":
Heh. No. Soundex is awful. There might be something better by now, but not Soundex. Anything but that. In a previous job I briefly flirted with it to perform name matching but it (or the SQL Server implementation at least) is useless - it collapses any name down to 4 consonants, making Steve and Stove identical, for instance.
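To make the Steve/Stove collision concrete, here is a minimal sketch of the classic American Soundex rules (first letter plus three digits) in Python. It is an illustration only: the function name is made up for this example, and it is not the SQL Server implementation, which may differ in detail.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter followed by three digits."""
    # Consonant groups that share a digit; vowels, Y, H and W have no code.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]  # pad with zeros, truncate to four characters


print(soundex("Steve"), soundex("Stove"))
```

Both names come out with the same code, because the vowels are dropped entirely and only the consonant classes survive.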
Anyway a Soundex-like tool might be useful to complement or improve searching, but the situation I'm describing here is when you know exactly what search terms you want to reach, but it's a lot of effort to create all those redirects.
There's been a better alternative to Soundex for many years called Metaphone. I think there are even several variants of it these days.
I did some tests with Soundex and Metaphone when I was developing my DidYouMean extension. It's not too hard to use a different normalization algorithm. I also tried anagrams and textonyms.
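As a rough illustration of what swapping the normalization function means, here is a sketch in Python of an anagram key and a textonym (phone keypad) key. This is not the DidYouMean code; the function names and the keypad table are assumptions made for the example. Two titles that normalize to the same key would be treated as candidates for each other.

```python
# Standard phone keypad layout, used to build the textonym key.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def anagram_key(title: str) -> str:
    """Sort the letters and digits so that all anagrams share one key."""
    return "".join(sorted(c for c in title.lower() if c.isalnum()))


def textonym_key(title: str) -> str:
    """Map each letter to its keypad digit; other alphanumerics pass through."""
    return "".join(LETTER_TO_DIGIT.get(c, c) for c in title.lower() if c.isalnum())


print(anagram_key("Listen"), anagram_key("Silent"))
print(textonym_key("good"), textonym_key("home"))
```

Either function could stand in for Soundex or Metaphone as the key that titles are indexed under, which is why changing the algorithm is not much work.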
Andrew Dunbar (hippietrail)
Steve
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/wikitech-l