David Causse, 05/01/2017 09:36:
Indeed from time to time I have to read lsearch2 code to understand what
was done before cirrus was deployed. :)
Concerning Russian, I think we do; apparently lsearchd used a simple
wrapper around the Lucene Russian stemmer [1]. If there is some other
custom code, or if you are aware of some regressions, I'd appreciate some
links so we can track them. I remember having seen some code (js
gadgets?) that does some custom Russian stemming...
I remember seeing some file with long lists of rules for Cyrillic, but
maybe it was SerbianFilter.java.
Concerning Hebrew, I hope we can find a good analyzer. According to the
comments in the code, the Hebrew analyzer that was tested appeared to be
unstable and was disabled.
Ah, makes sense. Thanks!
Nemo