On Tue, May 21, 2002 at 10:38:18AM -0700, Brion L. VIBBER wrote:
On mar, 2002-05-21 at 06:28, Kurt Jansson wrote:
I have set up the test site at http://test-de.wikipedia.com/,
Thanks Jason! I informed the German Wikipedians and hope they'll start playing around with it soon.
I wrote down some bugs I found on http://test-de.wikipedia.com/wiki/wikipedia:Beobachtete+Fehler
But that's in German, and I don't know whether anybody but Magnus will understand it. So shall I post a (badly) translated report to wikitech-l from time to time?
- The search for words with umlauts doesn't seem to work. The script
doesn't find them,
Hmm, looks like the search needs another drubbing. I believe it's chopping up words at the boundaries of non-ASCII characters; if your word is big enough on either end (übersetzung, terroranschläge) this does a good enough job anyway, but it's hardly the way it should work!
It doesn't (unless you have changed this in my code; I didn't check). The parsing function uses the Perl regular expression \w to decide whether a character is legal in a search word. If it thinks a character is illegal it gives an error, so you should have noticed that. If the result is empty and you didn't get a syntax error, there can be two problems:
- MySQL doesn't index that character. I wasn't able to find out exactly which characters are indexed by the full-text index and which are not. I know that characters with umlauts get indexed (search for "G"odel" for example) but I wouldn't know about the ringel-S (sp?) for example. The easiest way to find out is probably simply to try.
- The special characters in the articles were written using entities. The search doesn't know that "o and ö are the same.
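To make the two points above concrete, here is a minimal sketch in Python rather than the site's own code. It assumes nothing about the real parse function: the names WORD_ASCII, WORD_UNICODE, is_legal_word and normalize_entities are hypothetical, and Python's re module only stands in for the Perl-style \w. It shows how an ASCII-only \w rejects or chops words with umlauts while a Unicode-aware \w accepts them, and how decoding entities before indexing or searching would make both spellings of a word compare equal.

```python
import html
import re

# Hypothetical stand-ins for the search parser's word check.
# With re.ASCII, \w matches only [A-Za-z0-9_]; without it, \w also
# matches Unicode letters such as "ü" and "ß".
WORD_ASCII = re.compile(r"\w+", re.ASCII)
WORD_UNICODE = re.compile(r"\w+")


def is_legal_word(word, pattern=WORD_UNICODE):
    """Return True if every character of the word is matched by \\w."""
    return pattern.fullmatch(word) is not None


def normalize_entities(text):
    """Decode HTML entities so "&ouml;" and "ö" become the same string."""
    return html.unescape(text)


if __name__ == "__main__":
    word = "übersetzung"
    print(is_legal_word(word))               # Unicode-aware check
    print(is_legal_word(word, WORD_ASCII))   # ASCII-only check
    print(WORD_ASCII.findall(word))          # pieces left after ASCII-only matching
    print(normalize_entities("G&ouml;del") == "Gödel")
```

Whether the actual parser behaves like the ASCII-only or the Unicode-aware variant depends on the regex engine and locale in use, so this only illustrates the distinction, not the site's behaviour. It also says nothing about which characters the MySQL full-text index stores; that still has to be tested directly.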
Btw, if you have questions about the parse function for the search, just ask. I'm very busy at the moment so I don't have time to do any real programming, but answering a few questions should not be a problem.
-- Jan Hidders