Hi Robert,

Thanks a bunch for your prompt reply. I have changed the global conf file, and re-run the build, but Japanese characters are still repeated. Here is what my lsearch-global.conf file looks like:

------------------
[Database]
wikidb: (single) (spell,4,2) (language,ja)

[Search-Group]
<my hostname> : wikidb wikidb.spell

[Index]
<my hostname> : *

[Index-Path]
<default> : /search
...
------------------
I know I can't search for a single character, but now I can ONLY search for 2 characters. As mentioned earlier, because of the repetition, a search for more than 2 characters (e.g. 3 characters) does not match anything. Any more ideas?

Thanks again,
Ross
--- On Wed, 4/14/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Wednesday, April 14, 2010, 11:13 PM
The repetition of characters is a known bug with highlighting, so you will need to disable it. As for searching, as I said, you are stuck with only being able to search for two or more characters, not single characters, because this is how text analysis currently works.
To turn off lucene highlighting, use the following in your global conf file:

[Search-Group]
<put your host name here>: wikidb wikidb.spell
Cheers, r.
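The two-character minimum Robert describes can be illustrated with a small sketch. Robert does not say which analyzer lucene-search uses, so this assumes an overlapping two-character (bigram) scheme of the kind commonly applied to CJK text; the function names `cjk_bigrams` and `matches` are hypothetical and are not part of lucene-search.

```python
def cjk_bigrams(text):
    """Split text into overlapping two-character tokens.

    Assumption: a simplified stand-in for bigram-style CJK analysis;
    a single character produces no token at all.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


def matches(query, document):
    """Return True if the query yields at least one bigram and every
    query bigram also occurs among the document's bigrams."""
    query_tokens = cjk_bigrams(query)
    document_tokens = set(cjk_bigrams(document))
    return bool(query_tokens) and all(t in document_tokens for t in query_tokens)


document = "関数または"

print(cjk_bigrams(document))      # the tokens an index would hold in this sketch
print(matches("関", document))     # one character: no bigram, so nothing to look up
print(matches("関数", document))   # two characters: exactly one bigram to look up
```

Under this assumption a one-character query has nothing to look up, which is consistent with the limitation described above.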
Ross Xu wrote:
Thank you, Robert. I found the problem ... Each Japanese character gets repeated in the search results. For example, if I search for "関数", I get this in the result list:
Showing below results 1 - 20 of 111 View (previous 20) (next 20) (20 | 50 | 100 | 250 | 500)
Checkers:BSTR.FUNC.LEN/ja SysStringLen 関数数ままたたは SysStringByteLen 関数数をを使使用用しして、BSTR 以外外のの文文字字列列のの長長ささをを取取得得ししよよううととししてていいまます。 これれららのの関関数数のの唯唯一一のの引引数数は、BSTR、CComBSTR、 … 669 B (160 words) - 13:42, 30 September 2009
Checkers:VOIDRET/ja void 型のの関関数数ととししてて宣宣言言さされれたた関関数数がが値値をを返返ししてていいまます。 脆弱弱性性ととリリススク : スタタイイルル関関連連のの問問題題でです。 例 : void A:foo()5 6 return 0; // 値が void 関数数かからら返返さされれる 7 … 444 B (91 words) - 20:17, 1 February 2010 ...
It doesn't repeat English letters or words. So, if I only search for 2 Japanese characters, I can find them in the results, because the 2 characters can still be matched within the repeated characters. If I search for more than 2 characters, it can't find anything. What causes this problem, and how can I fix it?

Thanks again,
Ross
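The pattern in the snippets above looks like overlapping two-character tokens printed one after another, rather than every character simply being doubled (note that the first and last character of each run appear only once). The sketch below reproduces that pattern on plain strings. It is an inference from the pasted output, not something confirmed in this thread, and `overlapping_bigrams` and `join_tokens` are hypothetical helper names.

```python
def overlapping_bigrams(text):
    """Return every pair of adjacent characters in text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def join_tokens(text):
    """Concatenate the overlapping bigrams, as a display layer might do
    if it printed tokens instead of the original text (assumption)."""
    return "".join(overlapping_bigrams(text))


garbled = join_tokens("関数または")
print(garbled)               # 関数数ままたたは, the same run seen in the first result

# A two-character string survives inside the garbled text,
# while a three-character string no longer appears contiguously.
print("関数" in garbled)      # True
print("関数ま" in garbled)    # False
```

This only describes the text as displayed; it says nothing about what is stored in the index.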
--- On Thu, 4/8/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Thursday, April 8, 2010, 2:15 AM
The config file looks good. Further debugging steps:

1) make sure the search you are making is showing up in the lucene-search log; if you just start the daemon with ./lsearchd, that would be the console
2) make sure you have an MWSearch version that matches your MediaWiki version
3) note that you cannot search for a single Japanese character, but only for 2 or more
r.
Ross Xu wrote:
Thanks for your reply, Robert. But I did rebuild the index (./build) after changing (language,en) to (language,ja). It still doesn't work. There is no problem with searching any English keywords.
You mentioned "add (language,ja)". Did you mean to ADD (language,ja) alongside (language,en)? The entry in my lsearch-global.conf file looks like this:

[Database]
wikidb : (single) (spell,4,2) (language,ja)

Any more ideas?

Thanks again,
Ross
--- On Wed, 4/7/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Wednesday, April 7, 2010, 6:50 PM
There is rather limited support, but it does work (e.g. see ja.wikipedia.org). Don't forget to rebuild your index (./build) after you add (language,ja).
r.
Ross Xu wrote:
Hi there, I am using Lucene-Search 2.1/MWSearch for my MediaWiki 1.15.1. It's working fine, but it can't search any Japanese characters. I have tried (language,ja) in the lsearch-global.conf file, but it doesn't seem to make any difference. Any idea would be appreciated, Ross
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l