Thanks Robert. It's nice to know. Is there any plan to support database prefixes in the future? I am just curious why it does not repeat English words or letters, and only repeats Japanese characters. If there were no Japanese characters in my wiki sites, my configuration would work perfectly. BTW, what would your "two lines in the [Database] section instead of one" look like? Thanks again, Ross
--- On Thu, 4/15/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Thursday, April 15, 2010, 6:34 AM
Lucene-search currently doesn't support database prefixes, and I imagine this is why your hacked-up setup doesn't work. The extension is designed so that all of the wikis can be indexed and searched by a single daemon; you just need to merge the appropriate sections (i.e. have two lines in the [Database] section instead of one). A way to go for you might be to dump both wikis separately and then import them under different names, but lucene-search is not designed to run this kind of setup out of the box.
r.
Ross Xu wrote:
Hi there,

I don't think it's related, but this wiki site is the second one on the same machine. The two wiki sites share the same MySQL database with different prefixes, and share the same MediaWiki source using symbolic links.

The first one (called wiki1) works well with Lucene-Search 2.1/MWSearch and searches Japanese without any problem. I even use (language,en) instead of (language,ja) in its global conf file.

The second one (called wiki2) uses Lucene-Search 2.1/MWSearch on different ports (Search.port=8124 and Index.port=8322). It works well when searching for English keywords, but when searching for Japanese characters, each Japanese character is repeated in the snippets. And I have to use (language,ja); otherwise I get no Japanese search results at all.

Even if I kill all lsearchd daemons, redo the index (build) for wiki2, and restart lsearchd for wiki2, it's still the same.

Any idea is appreciated, Ross
--- On Wed, 4/14/10, Ross Xu rossxunix@yahoo.ca wrote:
From: Ross Xu rossxunix@yahoo.ca
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Wednesday, April 14, 2010, 11:52 PM
Hi Robert, Thanks a bunch for your prompt reply. I have changed the global conf file and re-run the build, but Japanese characters are still repeated. Here is what my lsearch-global.conf file looks like:
[Database]
wikidb: (single) (spell,4,2) (language,ja)

[Search-Group]
<my hostname> : wikidb wikidb.spell

[Index]
<my hostname> : *

[Index-Path]
<default> : /search
...
I know I can't search for a single character, but right now I can ONLY search for 2 characters. As mentioned earlier, because of the repetition, searching for more than 2 characters (e.g. 3 characters) does not match anything. Any more ideas? Thanks again, Ross
--- On Wed, 4/14/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Wednesday, April 14, 2010, 11:13 PM
The repetition of characters is a known bug with highlighting, so you will need to disable it. As for searching, as I said you are stuck with only being able to search for two or more characters, not single characters because this is how text analysis currently works.
To turn off lucene highlighting, use the following in your global conf file:
[Search-Group]
<put your host name here>: wikidb wikidb.spell
Cheers, r.
Ross Xu wrote:
Thank you, Robert. I found the problem ... Each Japanese character gets repeated in the search results. For example, if I search for "関数", I get this in the result list:
Showing below results 1 - 20 of 111 View (previous 20) (next 20) (20 | 50 | 100 | 250 | 500)
Checkers:BSTR.FUNC.LEN/ja SysStringLen 関数数ままたたは SysStringByteLen 関数数をを使使用用しして、BSTR 以外外のの文文字字列列のの長長ささをを取取得得ししよよううととししてていいまます。 これれららのの関関数数のの唯唯一一のの引引数数は、BSTR、CComBSTR、 … 669 B (160 words) - 13:42, 30 September 2009
Checkers:VOIDRET/ja void 型のの関関数数ととししてて宣宣言言さされれたた関関数数がが値値をを返返ししてていいまます。 脆弱弱性性ととリリススク : スタタイイルル関関連連のの問問題題でです。 例 : void A:foo()5 6 return 0; // 値が void 関数数かからら返返さされれる 7 … 444 B (91 words) - 20:17, 1 February 2010 ...
It doesn't repeat English letters or words. So, if I search for only 2 Japanese characters, I can get them in the results because the 2 characters can still be matched within the repeated characters. If I search for more than 2 characters, it can't find anything. What causes this problem, and how can I fix it? Thanks again, Ross
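A note on the pattern in the snippets quoted in this message: the first and last character of each Japanese run appear once, while every character in between appears twice. That is what one would see if the text were split into overlapping two-character tokens and the tokens were then joined end to end. The sketch below only reproduces that pattern; the thread does not confirm that lucene-search works this way internally, and the function names are hypothetical.

```python
from typing import List


def overlapping_bigrams(text: str) -> List[str]:
    """Split a run of characters into overlapping two-character tokens.

    Assumption for illustration only: the thread does not state how
    lucene-search tokenizes Japanese text.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


def concatenate_tokens(text: str) -> str:
    """Join the tokens end to end without merging the overlapping characters."""
    return "".join(overlapping_bigrams(text))


print(overlapping_bigrams("関数または"))  # ['関数', '数ま', 'また', 'たは']
print(concatenate_tokens("関数または"))   # 関数数ままたたは
```

The second line of output matches the start of the first snippet quoted above ("関数数ままたたは"), which fits the description of the repetition as a display-side highlighting problem.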
--- On Thu, 4/8/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Thursday, April 8, 2010, 2:15 AM
The config file looks good. Further debugging steps:
1) make sure the search you are making shows up in the lucene-search log; if you just start the daemon with ./lsearchd, that would be the console
2) make sure you have an MWSearch version that matches your MediaWiki version
3) note that you cannot search for a single Japanese character, only for 2 or more
r.
Ross Xu wrote:
Thanks for your reply, Robert. But I did rebuild the index (./build) after changing (language,en) to (language,ja). It still doesn't work. There is no problem with searching any English keywords.
You mentioned "add (language,ja)". Did you mean to ADD (language,ja) besides (language,en)? The entry is like this in my lsearch-global.conf file:
[Database]
wikidb : (single) (spell,4,2) (language,ja)
Any more ideas? Thanks again, Ross
--- On Wed, 4/7/10, Robert Stojnic rainmansr@gmail.com wrote:
From: Robert Stojnic rainmansr@gmail.com
Subject: Re: [Mediawiki-l] Does Lucene-Search Support Japanese?
To: "MediaWiki announcements and site admin list" mediawiki-l@lists.wikimedia.org
Received: Wednesday, April 7, 2010, 6:50 PM
There is rather limited support, but it does work (e.g. see ja.wikipedia.org). Don't forget to rebuild your index (./build) after you add (language,ja).
r.
Ross Xu wrote:
Hi there, I am using Lucene-Search 2.1/MWSearch for my MediaWiki 1.15.1. It's working fine, but it can't search any Japanese characters. I have tried (language,ja) in the lsearch-global.conf file, but it doesn't seem to make any difference. Any idea would be appreciated, Ross
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l