One cute trick that I have often used is to calculate the
"uselessness" of a particular word: the more often a word appears in
the set being searched, the more useless it is semantically, and the
less it should count toward a match. In practice this improves
results dramatically, especially on small datasets. (Maybe on
really big ones, too, but I've never played with those.)
Thus if someone searches for 'John Malkovich' they get a good result,
because 'John' is not weighted so heavily -- it's a more useless word
because it appears more often in the search set. But 'Malkovich', now
you're talking, there's a word that _means something_.
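
Here's a rough sketch of the idea in Python. The function names, the
sample documents, and the log-based weighting are all assumptions made
for illustration; any formula that shrinks as a word gets more common
would do. (This particular one has the same shape as what the search
literature calls inverse document frequency.)

```python
import math
from collections import Counter


def word_weights(documents):
    """Give each word a weight that shrinks as its "uselessness" grows.

    Uselessness here is the fraction of documents containing the word;
    the weight is log(total / count), so a word found in every document
    gets weight 0. The formula is an illustrative assumption.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each word once per document, ignoring case.
        doc_freq.update(set(doc.lower().split()))
    total = len(documents)
    return {word: math.log(total / count) for word, count in doc_freq.items()}


def score(query, document, weights):
    """Sum the weights of the query words that occur in the document."""
    doc_words = set(document.lower().split())
    query_words = set(query.lower().split())
    return sum(weights.get(w, 0.0) for w in query_words if w in doc_words)


def search(query, documents):
    """Return the documents ordered from best match to worst."""
    weights = word_weights(documents)
    return sorted(documents, key=lambda d: score(query, d, weights), reverse=True)


# Hypothetical search set: 'john' is common, 'malkovich' is rare.
DOCUMENTS = [
    "john smith plumber",
    "john malkovich actor",
    "john doe unknown",
    "mary jones baker",
]

RESULTS = search("John Malkovich", DOCUMENTS)
```

With this sample set, 'john' shows up in three of the four documents
and 'malkovich' in only one, so a match on 'malkovich' counts for far
more and the Malkovich entry sorts to the top of RESULTS.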