Hey everyone,
In case you haven't seen it already, I wrote a blog post about "unpacking" and updating our default language analyzers used for search. It's a project (made of many little projects) that I've been working on over the last year or two. The blog post is a review of the project and some of the fun language facts and computational complexities I've encountered.
https://diff.wikimedia.org/2023/04/28/language-harmony-and-unpacking-a-year-...
Hope you enjoy it.* —Trey
Trey Jones
Staff Computational Linguist, Search Platform
Wikimedia Foundation
UTC–4 / EDT
_____
* Read the footnotes!