My point here, for the long term: maybe it's difficult to build good suggestions directly from the data, so why not build a custom dictionary/index to handle "Did you mean" suggestions? According to https://www.youtube.com/watch?v=syKY8CrHkck#t=22m03s they learn from search queries to build these suggestions. Is this something worth trying?

Definitely worth trying. Erik's got a patch in for adding a session id ( https://gerrit.wikimedia.org/r/#/c/226466/ ). In addition to identifying prefix searches that come right before the user finishes typing their full text query, this would be good for looking for zero-results (or even low-results) queries followed by a similarly-spelled successful query from the same search session. "saerch" followed by "search" gives us hints that the latter is a good suggestion for the former—especially if it happens a lot.
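
To make that concrete, here's a rough sketch in Python of the kind of mining I have in mind. Everything in it is hypothetical: the log format of (session key, query, result count) tuples, the function names, and the thresholds are assumptions for illustration, not what the patch or our logging actually does. It also assumes we have some key that groups queries from the same search session. It uses plain Levenshtein distance as the "similarly-spelled" test, which is just one possible choice.

```python
from collections import Counter, defaultdict


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if chars match
            ))
        prev = curr
    return prev[-1]


def mine_suggestions(log, max_distance=2, min_count=2):
    """Hypothetical miner for "did you mean" pairs.

    log: iterable of (session_key, query, result_count) in time order.
    Returns {failed_query: most_frequent_correction}, keeping only pairs
    seen at least min_count times.
    """
    last_failed = {}                    # session_key -> latest zero-result query
    pair_counts = defaultdict(Counter)  # failed query -> Counter of corrections
    for session_key, query, hits in log:
        query = query.strip().lower()
        failed = last_failed.get(session_key)
        if hits == 0:
            # Remember the failure and wait for the next query in this session.
            last_failed[session_key] = query
            continue
        # Successful query: count it if it is spelled close to the failed one.
        if failed is not None and 0 < edit_distance(failed, query) <= max_distance:
            pair_counts[failed][query] += 1
        last_failed.pop(session_key, None)

    suggestions = {}
    for failed, corrections in pair_counts.items():
        best, count = corrections.most_common(1)[0]
        if count >= min_count:
            suggestions[failed] = best
    return suggestions
```

The `min_count` threshold is the "especially if it happens a lot" part: a single "saerch" then "search" pair is weak evidence, but the same pair across many sessions is a much stronger hint. The same loop could be relaxed to low-results queries by comparing `hits` against a small cutoff instead of zero.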
 

I figured out that the session id as written in that patch isn't going to work. I'm brainstorming in https://phabricator.wikimedia.org/T106552 about ways we can get this information without generating sessions for the user.

Erik B