SUMMARY: The Search Platform team (formerly part of Discovery) is planning
to fix a long-standing search bug on many wiki projects by disabling the
code in CirrusSearch that re-uses the “fallback” languages (which are
specified for user interface or system messages) for the language analysis
modules (which are used to index words in search). Deployment is planned to
start the week of October 9, 2017.
Messaging fallbacks specify what language to show a message in when there
is no message available in the language of a given wiki. A language
analysis module is language-specific software that processes text to
improve searching—so that, for example, searching for a given word will
find related forms of that word, like "hope, hopes, hoping, hoped" or
"resume, resumé, résumé" on English-language wikis.
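To give a rough feel for what a language analysis module does, here is a deliberately tiny Python sketch. It is not the analysis that CirrusSearch or Elasticsearch actually performs; the function names and the suffix list are made up for illustration, and real English analysis is far more careful. It only shows the general idea of reducing related word forms to a shared index term.

```python
import unicodedata


def fold_accents(word: str) -> str:
    """Strip diacritics, so that 'résumé' and 'resume' compare equal."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(word: str) -> str:
    """Very naive English suffix stripping (illustrative assumption only)."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(word: str) -> str:
    """Lowercase, fold accents, then stem: the term that would be indexed."""
    return toy_stem(fold_accents(word.lower()))


for group in (["hope", "hopes", "hoping", "hoped"],
              ["resume", "resumé", "résumé"]):
    print({w: analyze(w) for w in group})
```

Each group of words collapses to a single term, which is why a search for one form can find the others. The suffix rules here are specific to English, which is the point: applying them to text in an unrelated language would do little good and could do harm.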
Fallback languages for system messages make sense for historical and
cultural reasons—a reader of the Chechen Wikipedia is more likely to
understand a user interface or system message in Russian than in French,
Greek, Hindi, Italian, or Japanese—but the fallbacks don't necessarily make
any linguistic sense. Chechen and Russian, for example, are from unrelated
language families; while the languages have undoubtedly influenced one
another, their grammars are completely different.
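The difference between the old and new behavior can be sketched in a few lines of Python. Everything here is hypothetical: the dictionaries, the function names, and the "default" analyzer label are assumptions for illustration and do not reflect the real CirrusSearch code or configuration. The only detail taken from above is that Chechen ("ce") falls back to Russian ("ru") for messages.

```python
# Hypothetical data for illustration; not the real configuration.
MESSAGE_FALLBACKS = {"ce": ["ru"]}   # UI message fallback chain
AVAILABLE_ANALYZERS = {"ru", "en"}   # languages assumed to have an analysis module
EXPLICIT_ANALYSIS_FALLBACKS = {}     # explicitly configured cross-language exceptions


def pick_analyzer_old(lang: str) -> str:
    """Old behavior: walk the *message* fallback chain to find an analyzer."""
    for candidate in [lang] + MESSAGE_FALLBACKS.get(lang, []):
        if candidate in AVAILABLE_ANALYZERS:
            return candidate
    return "default"


def pick_analyzer_new(lang: str) -> str:
    """New behavior: ignore message fallbacks; honor only explicit exceptions."""
    if lang in AVAILABLE_ANALYZERS:
        return lang
    explicit = EXPLICIT_ANALYSIS_FALLBACKS.get(lang)
    if explicit in AVAILABLE_ANALYZERS:
        return explicit
    return "default"


print(pick_analyzer_old("ce"))  # Russian analysis applied to Chechen text
print(pick_analyzer_new("ce"))  # generic, language-neutral analysis instead
```

In this sketch a wiki whose language has its own analysis module is unaffected; only wikis that were silently borrowing an analyzer through the message fallback chain change behavior.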
We will deploy the software change that disables using messaging fallbacks
for language analysis fallbacks in about two weeks (targeting the week of
October 9, 2017), with any cross-language analysis exceptions explicitly
configured in a new manner. The change will not take effect immediately on
all affected wikis, because each wiki in each language will need to be
re-indexed, which is a separate process that takes time. There may also be
other delays caused by Elasticsearch upgrades or other changes.
You can also track progress of the tasks on Phabricator or read more,
see examples, and get the full list of languages affected on MediaWiki.
Sr. Software Engineer, Search Platform