David Causse, 05/01/2017 09:36:
Indeed, from time to time I have to read the lsearch2 code to understand what was done before Cirrus was deployed.
:)
Concerning Russian, I think we do; apparently lsearchd used a simple wrapper around the Lucene Russian stemmer [1]. If there is other custom code, or if you are aware of any regressions, I'd appreciate some links so we can track them. I remember having seen some code (JS gadgets?) that does some custom Russian stemming...
I remember seeing some file with long lists of rules for Cyrillic, but maybe it was SerbianFilter.java.
Concerning Hebrew, I hope we can find a good analyzer; according to the comments in the code, the Hebrew analyzer that was tested appeared to be unstable and was disabled.
Ah, makes sense. Thanks!
Nemo