On Tue, May 21, 2002 at 10:38:18AM -0700, Brion L. VIBBER wrote:
On mar, 2002-05-21 at 06:28, Kurt Jansson wrote:
I have set up the test site at http://test-de.wikipedia.com/,
Thanks Jason! I informed the German Wikipedians and hope they'll start
playing around with it soon.
I wrote down some bugs I found on
http://test-de.wikipedia.com/wiki/wikipedia:Beobachtete+Fehler
But that's in German, and I don't think anybody but Magnus will
understand it. So shall I post a (badly) translated report to wikitech-l
from time to time?
* The search for words with umlauts doesn't seem to work. The script
doesn't find them,
Hmm, looks like the search needs another drubbing. I believe it's
chopping up words at the boundaries of non-ASCII characters; if your
word is big enough on either end (übersetzung, terroranschläge) this
does a good enough job anyway, but it's hardly the way it should work!
It doesn't (unless you have changed this in my code; I didn't check). The
parsing function uses the Perl regular expression \w to decide whether a
character is legal in a search word. If it thinks a character is illegal, it
gives an error, so you should have noticed that. If the result is empty and
you didn't get a syntax error, there can be two problems:
- MySQL doesn't index that character. I wasn't able to find out exactly which
characters are indexed by the full-text index and which are not. I know
that characters with umlauts get indexed (search for "G"odel" for example),
but I don't know about the ringel-S (sp?), for example. The easiest way to
find out is probably simply to try it.
- The special characters in the articles were written by using entities. The
search doesn't know that "o and ö are the same.
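
The \w check and the entity problem above can be illustrated with a small
Python sketch. This is not the actual search code: the function names are
hypothetical, and Python's re and html modules only stand in for whatever the
parser and the article storage really do. It shows that whether \w accepts a
character like ü depends on how the regex engine classifies characters, and
that text stored as entities does not contain the literal character until it
is decoded.

```python
import html
import re

# In Python, \w is Unicode-aware for str patterns by default, while re.ASCII
# restricts it to [a-zA-Z0-9_]. Which of the two the real parser's \w
# resembles is an assumption here, not something checked against its code.
UNICODE_WORD = re.compile(r"\w+")
ASCII_WORD = re.compile(r"\w+", re.ASCII)


def is_legal_search_word(word, pattern=UNICODE_WORD):
    # Hypothetical helper: True only if every character of the word matches \w.
    return pattern.fullmatch(word) is not None


def decode_entities(text):
    # Hypothetical helper: turn HTML entities such as &ouml; into the literal
    # character, so stored text and search words use the same form.
    return html.unescape(text)


stored = "G&ouml;del"   # article text written with an entity
query = "Gödel"         # search word typed with the literal character

print(is_legal_search_word("übersetzung"))              # Unicode-aware \w
print(is_legal_search_word("übersetzung", ASCII_WORD))  # ASCII-only \w
print(query in stored)                                  # raw stored text
print(query in decode_entities(stored))                 # after decoding
```

If the search index is built from the raw article text, decoding entities
before indexing (or before comparing) would be one way to make the two
spellings match; whether that fits the current code I can't say.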
Btw, if you have questions about the parse function for the search, just
ask. I'm very busy at the moment so I don't have time to do any real
programming, but answering a few questions should not be a problem.
-- Jan Hidders