On 4/13/06, Steve Bennett stevage@gmail.com wrote:
> That would be nice, but even the simple mechanism of exact matches would be a start. And then you can add fallbacks, like all upper case, all lower case, upper case first letter of each word and so on. If performance is the issue here.
All-upper or all-lower fallbacks aren't really effective on Wikipedia, because most of our multiword titles are mixed case.
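To make the limitation concrete, here is a rough sketch of the fallback chain proposed in the quoted message. The function names and the small title set are hypothetical and for illustration only; this is not how MediaWiki resolves titles.

```python
def candidate_titles(query):
    """Yield the query as typed, then simple case variants, skipping duplicates."""
    seen = set()
    for variant in (query, query.upper(), query.lower(), query.title()):
        if variant not in seen:
            seen.add(variant)
            yield variant


def find_title(query, titles):
    """Return the first candidate that exactly matches a known title, else None."""
    for candidate in candidate_titles(query):
        if candidate in titles:
            return candidate
    return None


# Hypothetical title set, for illustration only.
TITLES = {"NASA", "Albert Einstein", "History of the United States"}
```

With these titles, "nasa" and "albert einstein" are found through the upper-case and first-letter-capitalised fallbacks, but "history of the united states" is not, because none of the simple variants reproduces the mixed-case title.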
AFAIR, most string matches in MySQL are case-insensitive, which would mean we could do indexed case-insensitive matches quickly. But I'm guessing that our use of binary fields for titles (which is required because no version of MySQL has complete UTF-8 support) most likely breaks that.
Alas, MySQL doesn't have functional indexes (http://bugs.mysql.com/bug.php?id=4990), or it would be fairly trivial to offer a fast case-insensitive match.
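For anyone unfamiliar with the term, this is roughly what a functional index buys you. The sketch uses SQLite (via Python's sqlite3 module) as a stand-in, since it supports indexes on expressions from version 3.9.0; the table, index, and function names are made up for the example, and none of this reflects the real MediaWiki schema. Note that SQLite's built-in lower() only folds ASCII letters.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of titles with an index on lower(title)."""
    conn = sqlite3.connect(":memory:")
    # TEXT columns compare case-sensitively by default in SQLite (BINARY collation).
    conn.execute("CREATE TABLE page (title TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO page (title) VALUES (?)",
        [("NASA",), ("Albert Einstein",), ("History of the United States",)],
    )
    # The "functional" index: built on an expression rather than a plain column.
    conn.execute("CREATE INDEX page_title_lower ON page (lower(title))")
    return conn


def find_case_insensitive(conn, query):
    """Return stored titles whose lower-cased form equals the lower-cased query."""
    rows = conn.execute(
        "SELECT title FROM page WHERE lower(title) = lower(?)", (query,)
    )
    return [row[0] for row in rows]
```

A lookup such as find_case_insensitive(conn, "albert EINSTEIN") returns the stored mixed-case title, while a plain title = ? comparison stays case-sensitive. The WHERE clause has to use the same expression as the index for the index to be eligible.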