Re: [discovery] This quarter: researching new language analysers for search

5 Jan 2017


David Causse, 05/01/2017 09:36:
...
Indeed, from time to time I have to read lsearch2 code to understand what
was done before Cirrus was deployed.
:)
...
Concerning Russian, I think we do; apparently lsearchd used a simple
wrapper around the Lucene Russian stemmer [1]. If there is any other
custom code, or if you are aware of any regressions, I'd appreciate some
links so we can track them. I remember having seen some code (JS
gadgets?) that does some custom Russian stemming...
I remember seeing a file with long lists of rules for Cyrillic, but
maybe it was SerbianFilter.java.
...
Concerning Hebrew, I hope we can find a good analyzer; according to the
comments in the code, the Hebrew analyzer that was tested appeared to be
unstable and was disabled.
Ah, makes sense. Thanks!
Nemo
