It looks like a lot of the pieces needed to make this happen are out there.
Unfortunately, based on the description in English Wikipedia,[1] it doesn't look like a one-to-one transliteration. But when is language ever straightforward?
It looks like much of the work to deal with all the contextual variation and the exceptions to the transliteration was at least attempted twice. There's a zip file of code attached to the Phab ticket,[2] and a link to some code on-wiki.[5] From the comments, it looks like that code never quite worked, but it seems possible to harvest the conversion data from one or both and put it into the same format as the other existing language converters, like Kazakh.[3] It *might* even be easier this time, since it's been 6.5 years and the LanguageConverter code is probably more mature now.
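To illustrate why this isn't a plain character-for-character swap, here is a minimal Python sketch of longest-match-first transliteration. The table is a tiny illustrative subset that I'm assuming for the example (a few digraphs such as къ and гъ plus a handful of single letters). It is not the conversion data from the ticket, and it ignores case, context-dependent vowels, and loanword exceptions. The names `CYR_TO_LAT` and `transliterate` are hypothetical.

```python
# Illustrative subset only: NOT the full Crimean Tatar mapping, and it
# ignores uppercase, context-dependent vowels, and exceptions.
CYR_TO_LAT = {
    # Digraphs must win over their first letter, hence longest match first.
    "гъ": "ğ",
    "къ": "q",
    "нъ": "ñ",
    "дж": "c",
    "г": "g",
    "к": "k",
    "н": "n",
    "д": "d",
    "а": "a",
    "ы": "ı",
    "р": "r",
    "м": "m",
}


def transliterate(text, table=CYR_TO_LAT):
    """Convert text using the longest matching key at each position."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "къ" is not read as "к" + "ъ".
        for size in range(max_len, 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # Characters not in the table pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("къырым"))
```

Even this toy version needs a lookahead step for the digraphs. The real rules also depend on neighbouring sounds and word position, which is where the harvested conversion data would come in.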
It would be even better if someone could create an Elasticsearch plugin to do the same kind of conversion. That would allow cross-alphabet searching, too. I've been working with a plugin[4] that does that kind of thing for Traditional and Simplified Chinese.
—Trey
[1] https://en.wikipedia.org/wiki/Crimean_Tatar_alphabet#Cyrillic_to_Latin_trans...
[2] https://phabricator.wikimedia.org/T23582#247642
[3] https://doc.wikimedia.org/mediawiki-core/master/php/classKkConverter.html
[4] https://github.com/medcl/elasticsearch-analysis-stconvert
[5] https://phabricator.wikimedia.org/T23582#247634

Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
On Fri, Mar 24, 2017 at 5:40 AM, Vira Motorko vira.motorko@gmail.com wrote:
[I'm sorry if it's not the place to ask, please forward where it should be.]
Hi all,
There is a long-frozen idea: to make a transliterator for Crimean Tatar Wikipedia. Native speakers of crh use both Cyrillic and Latin script, depending on the country they used to live in. One example of a similar thing in use is https://kk.wikipedia.org, where one can choose which script the content is shown in.
There is an old task on Phabricator, and there were attempts to write a tool in PHP, but the effort stopped. https://phabricator.wikimedia.org/T23582
Maybe someone can/wants to help with this tool or create one from scratch? Maybe you know where else I can find help?
Thanks!
--
Vira Motorko
project manager, Wikimedia Ukraine https://ua.wikimedia.org/
non-profit organisation
m: +380667740499 | f: vira.motorko https://www.facebook.com/vira.motorko | w: Ата https://meta.wikimedia.org/wiki/User:Ата

Are you saving your documents in free formats? ;)
Help save natural resources – please think twice before printing this e-mail or any attachments.
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l