Ray Saintonge wrote:
Mark Ryan wrote:
Google search wouldn't pick up orphans much either.
We should be aiming to improve our inbuilt search, rather than going back to using Google by default (yes, we did have all searches going through Google or Yahoo! for a long time there). Google's search has a number of critical flaws, such as the inability to search for punctuation, and automatic spelling correction where no such correction is desired.
It would be nice to have boolean searching work on Wikipedia, but that's a topic for another mailing list entirely.
Boolean searching is only a part of it. Editors often need kinds of search sophistication that may not be useful to the casual reader, which is not to say that I would keep casual readers away from them. As one example, a Wiktionary editor might want to search Wikisource or Wikiquote for good examples of a word's usage, or to develop a statistical analysis of the way an author uses his words.
Ec
There's still plenty to be done just to get a simple text search working:
* Higher score for unstemmed matches
* Better score normalisation w.r.t. document size
* Improved case folding
* Accent stripping
* Chinese, Japanese and Thai word segmentation or n-gram search
* Better scalability
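
As a rough illustration of three of the items above (case folding, accent stripping and n-gram search), here is a minimal Python sketch. It is not taken from the existing search code; the function names are made up for the example, and it assumes Unicode NFD decomposition is good enough for accent stripping, which it is not for letters such as "ø" or "ł" that have no decomposed form.

```python
import unicodedata
from typing import List


def fold_case(text: str) -> str:
    # str.casefold() is more aggressive than lower(): e.g. German "ß" becomes "ss".
    return text.casefold()


def strip_accents(text: str) -> str:
    # Decompose each character into base letter + combining marks (NFD),
    # then drop the combining marks. Assumption: this is sufficient for the
    # languages being indexed; letters without a decomposition are left alone.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalise_term(term: str) -> str:
    # Hypothetical helper: apply the same normalisation at index time and
    # at query time so that "Café" and "cafe" match.
    return strip_accents(fold_case(term))


def char_ngrams(text: str, n: int = 2) -> List[str]:
    # Overlapping character n-grams, a common fallback for scripts written
    # without spaces between words (Chinese, Japanese, Thai) when no
    # word segmenter is available.
    if not text:
        return []
    if len(text) <= n:
        return [text]
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

Indexing the bigrams of a run of CJK text and querying with the bigrams of the search string gives usable matches without a dictionary, at the cost of a larger index and some false positives.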
User:Rainman-sr has been talking about doing a master's thesis on this subject, so anyone who wants to work on this should talk with him first, to avoid any duplication of effort.
-- Tim Starling