Hi everyone,
Back in May I started working on Cyrillic ↔︎ Latin transliteration of
Crimean Tatar (Phab ticket T23582 <https://phabricator.wikimedia.org/T23582>)
as a hackathon/10% project. Since August I've had a working transliterator,
based on a 2009 draft from user DonAlessandro, who provided all the
language smarts for the transliteration.
Using parallel corpora and recent speaker feedback (also from
DonAlessandro!) I was able to assess and refine the quality of the
transliteration
<https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Crimean_Tatar_Transliteration#Parallel_Corpora,_Round_2>:
99.6% accuracy from Latin to Cyrillic, and 97.4% in the other direction.
In October, having still not received any code review, I asked my colleagues
on the WMF Search Platform team to review the patch for general code quality.
They made several useful suggestions, and I've made the changes and submitted them.
The most up-to-date patch is on Gerrit
<https://gerrit.wikimedia.org/r/#/c/372479>.
Since there was some discussion about changes to language converters here a
few days ago, I was hoping someone here could help me figure out who to
ask, or what to do to get code review from someone who is familiar with
language converters and has +2 rights on mediawiki/core.
The ticket was about seven and a half years old when I started looking into
it, and it will be eight years old this coming Monday! It'd be great to get
it some code review for its birthday, and even better for the Crimean Tatar
community to have working transliteration.
Thanks!
—Trey
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation