On Fri, Aug 16, 2002 at 08:30:42PM -0700, lcrocker(a)nupedia.com wrote:
> That being said, if MySQL now produces more useful results in
> non-boolean mode, I'm certainly open to letting it do that; but I was
> under the impression that the booleans were really needed.

I just tried to reproduce the bad MySQL scoring results that made me AND
the search words in the first place, and I couldn't. I'm now running a
slightly newer and different version (3.23.41 on Linux) than when I
originally did those tests (3.23.?? on Windows), but perhaps I also did not
test enough and accidentally used some stopwords or words that fell foul of
the 50% limit (words that occur in half or more of the rows are ignored).
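
To illustrate the 50% limit, here is a toy sketch in Python. It is not how
MySQL implements it; it only counts, for a made-up list of page texts, which
words occur in half or more of them and would therefore be ignored by a
non-boolean fulltext search. The function name and the sample texts are
invented for the example.

```python
def words_over_half(docs):
    """Return the words that occur in at least half of the documents."""
    counts = {}
    for text in docs:
        # Count each word once per document, however often it occurs there.
        for word in set(text.lower().split()):
            counts[word] = counts.get(word, 0) + 1
    # Compare 2 * n with the document count to avoid float rounding.
    return {word for word, n in counts.items() if 2 * n >= len(docs)}


# Hypothetical page texts, purely for illustration.
docs = [
    "axel boldt wrote this page",
    "this page mentions axel",
    "another page about mysql",
    "the index page",
    "search results",
]

ignored = words_over_half(docs)  # only "page" is in half or more of the texts
```

A test query built only from such words would return nothing, which could
easily be mistaken for bad scoring.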
Anyway, I don't have SQL access, so could somebody do some testing on the
currently running MySQL server to see how well it ranks phrases like
'axel boldt'? As I remember it, pages with a lot of 'axel' on them were
ranked higher than pages that contained both 'axel' and 'boldt' only once.
The only thing that happens now is that pages with only 'axel' on them also
turn up in the results.
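
For whoever runs the test, the two query shapes being compared could look
roughly like the sketch below. This is an assumption, not the code that is
actually running: the table `pages` and the columns `title` and `body` are
hypothetical names, and ANDing one MATCH per word is just one way to require
all words on 3.23. The helpers only build the SQL string and its parameters;
they do not talk to a server.

```python
def natural_query(words):
    """One MATCH over all words: rows matching any of the words come back."""
    sql = (
        "SELECT title, MATCH(body) AGAINST(%s) AS score "
        "FROM pages WHERE MATCH(body) AGAINST(%s) "
        "ORDER BY score DESC"
    )
    text = " ".join(words)
    return sql, [text, text]


def anded_query(words):
    """One MATCH per word, ANDed: only rows matching every word come back."""
    match = "MATCH(body) AGAINST(%s)"
    score = " + ".join(match for _ in words)
    conds = " AND ".join(match for _ in words)
    sql = (
        f"SELECT title, {score} AS score "
        f"FROM pages WHERE {conds} ORDER BY score DESC"
    )
    # Parameters are needed once for the score and once for the WHERE clause.
    return sql, list(words) + list(words)


sql_any, params_any = natural_query(["axel", "boldt"])
sql_all, params_all = anded_query(["axel", "boldt"])
```

Comparing the top results of both queries for 'axel boldt' should show whether
pages with many occurrences of 'axel' still outrank pages that have both words.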
As far as the need for boolean searches is concerned, I implemented it
because Larry Sanger asked for it and it seemed important at the time that he
had good searching capabilities. I doubt that it is used very much at the
moment.

> I'm also planning to build a special "advanced search" page where
> it might be an option to do it either way.

Once we start using MySQL 4, that could be as simple as a text box that lets
people use the MySQL boolean fulltext search, along with some explanation of
how to do boolean searches, exact-phrase searches, and searches where the
search words have different relevancies.
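
As a rough sketch of what such a page might hand to MySQL 4, the helper below
assembles a boolean-mode search string from a few form fields. The operators
are the ones MySQL's boolean mode documents (`+` required, `-` excluded,
double quotes for an exact phrase, `>` and `<` to raise or lower a word's
relevance); the function, its parameters, and the `pages`/`title`/`body`
names are invented for illustration.

```python
def boolean_expression(required=(), excluded=(), phrases=(),
                       boosted=(), lowered=()):
    """Build a search string for MATCH ... AGAINST(... IN BOOLEAN MODE)."""
    parts = []
    parts += [f"+{w}" for w in required]     # word must be present
    parts += [f"-{w}" for w in excluded]     # word must not be present
    parts += [f'+"{p}"' for p in phrases]    # exact phrase must be present
    parts += [f">{w}" for w in boosted]      # word counts more towards rank
    parts += [f"<{w}" for w in lowered]      # word counts less towards rank
    return " ".join(parts)


# Hypothetical table and column names; the expression is passed as a parameter.
BOOLEAN_SQL = (
    "SELECT title FROM pages "
    "WHERE MATCH(body) AGAINST(%s IN BOOLEAN MODE)"
)

both_words = boolean_expression(required=["axel", "boldt"])
exact = boolean_expression(phrases=["axel boldt"], excluded=["nupedia"])
```

A plain text box would skip the helper entirely and pass the user's input
straight through as the parameter, which is why it needs the explanation
next to it.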
So that leaves only the stopword-list problem. Do we really want a separate
MySQL server for every language, each compiled with its own stopword list?
-- Jan Hidders