I haven't read the entire thread but it seems the extension I was
working on for Wiktionary would be relevant here:
http://www.mediawiki.org/wiki/Extension:DidYouMean
It normalizes article names and traps page creations, deletions, and
moves to maintain a database table of normalized titles.
On every page view and search request this table is queried (unless
the result is already cached) and a list of similar titles is suggested
to the user.
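In outline the mechanism works something like the following Python
sketch. To be clear, the actual extension is PHP against a real
database table; the in-memory index, function names, and the toy
normalization here are all made up for illustration:

```python
# Hypothetical in-memory stand-in for the extension's database table,
# mapping each normalized form to the real article titles sharing it.
normalized_index: dict[str, set[str]] = {}

def normalize(title: str) -> str:
    # Placeholder for the real normalization (accent folding, etc.):
    # here we just drop spaces and hyphens and lowercase.
    return title.replace(" ", "").replace("-", "").lower()

def on_page_created(title: str) -> None:
    # Called from the page-creation hook to keep the index current;
    # page moves would be a delete of the old name plus a create.
    normalized_index.setdefault(normalize(title), set()).add(title)

def on_page_deleted(title: str) -> None:
    normalized_index.get(normalize(title), set()).discard(title)

def similar_titles(title: str) -> set[str]:
    # At page view / search time: every title sharing the normalized
    # form, except the title being viewed itself.
    return normalized_index.get(normalize(title), set()) - {title}
```

So after creating "Co-op" and "Coop", viewing either page would
suggest the other.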
The English Wiktionary already has a way (using templates) to suggest
similar article titles. DidYouMean combines this hand-edited list with
its generated list and displays them in the manner expected on
Wiktionary.
I had already considered adding limited pattern matching on top of the
normalization, to catch kinds of similar titles that normalization
alone can't find. This would be essential for a Wikipedia solution.
Currently DidYouMean normalizes accented characters to unaccented
characters, strips Hebrew and Arabic vowels, normalizes Japanese
fullwidth and halfwidth characters to normal width, etc. It also
strips spaces, hyphens, apostrophes, periods, etc.
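Most of those folds fall out of standard Unicode normalization. Here's
a rough Python sketch of that style of normalization; it is not the
extension's actual rule set, just an approximation using NFKD
decomposition:

```python
import unicodedata

# Characters stripped outright: space, hyphen, ASCII and curly
# apostrophes, period. (Illustrative; the real list differs.)
STRIP_CHARS = set(" -'.\u2019")

def normalize_title(title: str) -> str:
    # NFKD folds fullwidth/halfwidth forms to ordinary width and
    # splits accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", title)
    out = []
    for ch in decomposed:
        # Dropping combining marks (category Mn) removes Latin
        # accents, and also Hebrew and Arabic vowel points, which
        # are encoded as combining characters.
        if unicodedata.category(ch) == "Mn":
            continue
        if ch in STRIP_CHARS:
            continue
        out.append(ch)
    return "".join(out).lower()
```

With this, "Café", "cafe" and fullwidth "ｃａｆｅ" all normalize to
the same key.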
Obviously the matching heuristics for Wikipedia would be different.
They might include word stemming and stoplists, but perhaps also
hand-coded rules maintained in a special page.
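For the Wikipedia case, a stoplist-plus-stemming normalizer might look
roughly like this. The stoplist and the crude suffix stripping are
purely illustrative; a real deployment would use a proper stemmer such
as Porter's:

```python
# Illustrative stoplist; a real one would be per-language and longer.
STOPWORDS = {"the", "a", "an", "of", "in", "and"}

def wikipedia_normalize(title: str) -> str:
    words = title.lower().split()
    kept = [w for w in words if w not in STOPWORDS]
    stemmed = []
    for w in kept:
        # Crude stemming: strip one common suffix, only from words
        # long enough that something recognizable remains.
        for suffix in ("ing", "es", "s"):
            if w.endswith(suffix) and len(w) > len(suffix) + 2:
                w = w[: -len(suffix)]
                break
        stemmed.append(w)
    return " ".join(stemmed)
```

This would map, say, "Lists of Birds" and "List of Birds" to the same
key, so either could be suggested when the other is requested.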
It might even be possible to automate page disambiguation to some
degree using these methods.
Andrew Dunbar (hippietrail)