Jimbo, the tricks that you mention (weighting titles more heavily than text, weighting rare words more heavily than common ones, stopwords etc.) are already used by MySQL's fulltext index; I don't think it would make sense to reinvent the wheel, especially since we now have real-time searching and boolean searches, both of which are kind of nice. I am also pretty sure that the MySQL index is reasonably fast, being written in C. They explain a bit about it at http://www.mysql.com/doc/F/u/Fulltext_Search.html
Regarding the three-letter limit: since we already parse the search string for AND, OR and NOT anyway, it should be pretty easy to remove short words from the search string, run the query without them, and then report the results with a warning like: The short word "the" was ignored.
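To make the idea concrete, here is a rough sketch in Python (not our actual search code). The function name strip_short_words, the MIN_WORD_LENGTH value of 4, and the way dangling operators are cleaned up are all my assumptions for illustration; the real cutoff depends on how the fulltext index is configured.

```python
# Hypothetical sketch: drop short words from a boolean search string
# and collect a warning for each one that was dropped.

OPERATORS = {"AND", "OR", "NOT"}

# Assumption: words shorter than this are not indexed, so searching
# for them can never match. Adjust to the index's real setting.
MIN_WORD_LENGTH = 4


def strip_short_words(search_string, min_length=MIN_WORD_LENGTH):
    """Return (cleaned_query, warnings) for a whitespace-separated query."""
    kept = []
    ignored = []
    for token in search_string.split():
        if token == "NOT":
            # Keep a NOT unless it directly repeats another NOT.
            if not kept or kept[-1] != "NOT":
                kept.append(token)
        elif token in OPERATORS:
            # AND / OR only make sense right after a kept search word.
            if kept and kept[-1] not in OPERATORS:
                kept.append(token)
        elif len(token) < min_length:
            ignored.append(token)
            # A NOT that applied to the dropped word must go as well,
            # otherwise it would negate the next word instead.
            if kept and kept[-1] == "NOT":
                kept.pop()
        else:
            kept.append(token)
    # Remove operators left dangling at the end of the query.
    while kept and kept[-1] in OPERATORS:
        kept.pop()
    warnings = ['The short word "%s" was ignored.' % word for word in ignored]
    return " ".join(kept), warnings


if __name__ == "__main__":
    query, warnings = strip_short_words("the AND history OR cat")
    print(query)
    for line in warnings:
        print(line)
```

The operator cleanup is only a heuristic: when a short word sits between two operators, the sketch keeps the first operator and drops the second, which may not be what the user meant. If every word in the query is too short, the cleaned query comes back empty and the caller would have to skip the search and show just the warnings.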
Axel
wikitech-l@lists.wikimedia.org