It looks like a lot of the pieces needed to make this happen are out there.
Unfortunately, based on the description in English Wikipedia,[1] it doesn't
look like a one-to-one transliteration. But when is language ever
straightforward?
It looks like much of the work to deal with all the contextual variation
and the exceptions to the transliteration has been attempted at least twice.
There's a zip file of code attached to the Phab ticket,[2] and a link to some
code on-wiki.[5] From the comments, it looks like that code never quite
worked, but it seems possible to harvest the conversion data from one or
both and put it into the same format as the other existing language
converters, like Kazakh.[3] It *might* be easier this time, since it's
been 6.5 years and the LanguageConverter code is probably more mature now.
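To make "contextual variation and exceptions" concrete, here is a minimal Python sketch of a table-driven converter. It is not the code from the ticket and not the LanguageConverter format: the tables are a small, unverified illustrative subset, the word-initial rule for "е" is an assumption based on the Wikipedia description,[1] and the names (`DIGRAPHS`, `SINGLE`, `convert_word`, `transliterate`) are made up for the example.

```python
import re

# Illustrative subset only: NOT a complete or verified Crimean Tatar table.
# Two-letter sequences must be tried before single letters.
DIGRAPHS = {"гъ": "ğ", "къ": "q", "нъ": "ñ", "дж": "c"}
SINGLE = {
    "а": "a", "б": "b", "д": "d", "е": "e", "и": "i", "к": "k",
    "л": "l", "м": "m", "н": "n", "р": "r", "с": "s", "т": "t",
    "ч": "ç", "ш": "ş", "ы": "ı",
}

WORD = re.compile(r"\w+")


def convert_word(word, exceptions):
    """Convert one word; case is folded to lowercase to keep the sketch short."""
    lower = word.lower()
    # 1. Whole-word exceptions win over every rule.
    if lower in exceptions:
        return exceptions[lower]
    out = []
    i = 0
    while i < len(lower):
        # 2. Longest match first: try a two-letter sequence.
        pair = lower[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
            continue
        ch = lower[i]
        # 3. Contextual rule (assumed): word-initial "е" becomes "ye".
        if ch == "е" and i == 0:
            out.append("ye")
        else:
            # Letters missing from the table pass through unchanged.
            out.append(SINGLE.get(ch, ch))
        i += 1
    return "".join(out)


def transliterate(text, exceptions=None):
    """Convert every word in text, leaving spaces and punctuation alone."""
    exceptions = exceptions or {}
    return WORD.sub(lambda m: convert_word(m.group(0), exceptions), text)


print(transliterate("къырым татар"))
```

The order of the three steps is the point: exceptions first, then multi-letter sequences, then context-sensitive single letters. A real converter would need the full tables, case handling, and the reverse direction.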
It would be even better if someone could create an Elasticsearch plugin to
do the same kind of conversion. That would allow cross-alphabet searching,
too. I've been working with a plugin[4] that does that kind of thing for
Traditional and Simplified Chinese.
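As a rough illustration of the cross-alphabet idea without a custom plugin, the Python sketch below builds index settings that use Elasticsearch's built-in `mapping` character filter to fold Cyrillic into Latin at both index and query time. This is a swapped-in technique, not the stconvert plugin[4] and not a tested setup: the filter and analyzer names (`crh_cyrl_to_latn`, `crh_folded`) are hypothetical, the pairs are placeholders, and a plain mapping filter cannot express contextual rules or word-level exceptions.

```python
import json


def build_index_settings(pairs):
    """Return an index-settings body that folds one script into the other.

    pairs maps source strings to target strings, e.g. {"къ": "q"}.
    The char filter runs before the lowercase token filter, so uppercase
    source letters would have to be listed as pairs of their own.
    """
    mappings = [f"{src} => {dst}" for src, dst in pairs.items()]
    return {
        "settings": {
            "analysis": {
                "char_filter": {
                    # Hypothetical name for the built-in "mapping" filter.
                    "crh_cyrl_to_latn": {
                        "type": "mapping",
                        "mappings": mappings,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name.
                    "crh_folded": {
                        "type": "custom",
                        "char_filter": ["crh_cyrl_to_latn"],
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    }
                },
            }
        }
    }


# Placeholder pairs; a real table would come from the harvested data.
body = build_index_settings({"къ": "q", "ы": "ı"})
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Using the same analyzer for indexing and searching means a query typed in either script is reduced to the same tokens, which is what makes the search cross-alphabet.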
—Trey
[1] https://en.wikipedia.org/wiki/Crimean_Tatar_alphabet#Cyrillic_to_Latin_tran…
[2] https://phabricator.wikimedia.org/T23582#247642
[3] https://doc.wikimedia.org/mediawiki-core/master/php/classKkConverter.html
[4] https://github.com/medcl/elasticsearch-analysis-stconvert
[5] https://phabricator.wikimedia.org/T23582#247634
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
On Fri, Mar 24, 2017 at 5:40 AM, Vira Motorko <vira.motorko(a)gmail.com>
wrote:
[I'm sorry if this isn't the right place to ask; please forward it to where
it should be.]
Hi all,
There is a long-frozen idea: to make a transliterator for Crimean Tatar
Wikipedia. Native speakers of crh use both Cyrillic and Latin script,
depending on the country they used to live in.
One example of a similar thing in use is https://kk.wikipedia.org, where one
can choose which script they see the content in.
There is an old task on Phabricator, and there were attempts to write a tool
in PHP, but the effort stopped.
https://phabricator.wikimedia.org/T23582
Maybe someone can/wants to help with this tool or create one from scratch?
Maybe you know where else I can find help?
Thanks!
*--*
*Vira Motorko*
project manager, Wikimedia Ukraine <https://ua.wikimedia.org/> non-profit
organisation
m: +380667740499 | f: vira.motorko <https://www.facebook.com/vira.motorko> | w: Ата <https://meta.wikimedia.org/wiki/User:Ата>
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l