On Wed, Aug 28, 2013 at 3:37 PM, Paul Selitskas <p.selitskas(a)gmail.com> wrote:
Will it be set as the search backend further on Wikimedia projects?
Yes. I'm not sure when though.
Is there source code available for Elasticsearch on Gerrit?
Our plugin that interacts with Elasticsearch is called CirrusSearch and
lives in Gerrit here:
<https://gerrit.wikimedia.org/r/#/projects/mediawiki/extensions/CirrusSearch,dashboards/default>
https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/Cirrus…
Elasticsearch lives on GitHub here:
https://github.com/elasticsearch/elasticsearch
Stemming doesn't work for some languages at all, thus searching exact
matches only.
Stemming is done based on the language of the wiki. I expect only English
stemming to work on mediawiki.org. Right now we use the default language
analysers for all the languages that Elasticsearch supports out of the box (
http://www.elasticsearch.org/guide/reference/index-modules/analysis/lang-an…)
with some customizations for English. Languages that aren't better
supported get a "default" analyser that doesn't do any stemming and splits
on spaces. I expect we'll have to build some more analysers in the
future.
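
To make the fallback concrete, here is a rough Python sketch of choosing an analyser by wiki language. It is not the actual CirrusSearch code: the function name, the "text" analyser name and the short language table are all made up for illustration, and the real list of built-in language analysers is longer.

```python
# Hypothetical subset of wiki language codes that map to a built-in
# Elasticsearch language analyser (the real list is longer).
BUILT_IN_ANALYSERS = {
    "en": "english",
    "fr": "french",
    "de": "german",
    "ru": "russian",
}


def analysis_settings(wiki_language):
    """Build an index 'analysis' settings dict for a wiki language code.

    Illustrative only: the analyser name "text" is an assumption.
    """
    name = BUILT_IN_ANALYSERS.get(wiki_language)
    if name is not None:
        # Language analyser shipped with Elasticsearch (includes stemming).
        text_analyser = {"type": name}
    else:
        # "Default" fallback: split on whitespace, no stemming.
        text_analyser = {"type": "custom", "tokenizer": "whitespace", "filter": []}
    return {"analysis": {"analyzer": {"text": text_analyser}}}
```

With this sketch, analysis_settings("en") selects the "english" analyser, while a code missing from the table, such as "be", gets the whitespace-only fallback, which is why only exact matches are found there.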
Nik