On Wed, Aug 28, 2013 at 3:37 PM, Paul Selitskas <p.selitskas@gmail.com> wrote:
Will it be set as the search backend further on Wikimedia projects?
Yes. I'm not sure when, though.
Is there source code available for Elasticsearch on Gerrit?
Our plugin that interacts with Elasticsearch is called CirrusSearch and lives in Gerrit here:
https://gerrit.wikimedia.org/r/#/projects/mediawiki/extensions/CirrusSearch,dashboards/default
https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/CirrusS...
Elasticsearch itself lives on GitHub here:
https://github.com/elasticsearch/elasticsearch
Stemming doesn't work for some languages at all, so search finds exact matches only.
Stemming is done based on the language of the wiki, so I expect only English stemming to work on mediawiki.org. Right now we use the default language analysers for all the languages that Elasticsearch supports out of the box (http://www.elasticsearch.org/guide/reference/index-modules/analysis/lang-ana...), with some customizations for English. Languages without a dedicated analyser get a "default" analyser that doesn't do any stemming and splits on spaces. I expect we'll have to build some more analysers in the future.
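To make the fallback described above concrete, here is a minimal Python sketch. It is not CirrusSearch's actual code. The language table, the field name "text", and the function name are all hypothetical, and using Elasticsearch's built-in "whitespace" analyser as the no-stemming default is an assumption made for illustration.

```python
# Hypothetical, partial table mapping wiki language codes to the names of
# Elasticsearch's built-in language analysers. The real list is longer.
BUILTIN_LANGUAGE_ANALYZERS = {
    "en": "english",
    "de": "german",
    "fr": "french",
    "ru": "russian",
}

# Assumption: the built-in "whitespace" analyser stands in for the
# "default" analyser (no stemming, splits on whitespace).
FALLBACK_ANALYZER = "whitespace"


def analysis_settings(wiki_language):
    """Build index analysis settings for a wiki in the given language."""
    analyzer_type = BUILTIN_LANGUAGE_ANALYZERS.get(wiki_language, FALLBACK_ANALYZER)
    return {
        "analysis": {
            "analyzer": {
                # "text" is a made-up analyser name for this sketch.
                "text": {"type": analyzer_type},
            }
        }
    }
```

With this table, analysis_settings("en") selects the stemming "english" analyser, while a language code missing from the table falls back to "whitespace", which is why such a wiki would only find exact matches.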
Nik