In general I am a strong believer in "let's start with the simple thing", which is to let editors add transliterations (that is why we have a label field for every entity in every language).
I can see a use case for a transliteration bot that does some of the transliterations (semi-)automatically, but I think this is probably something that should be left to the community.
There might be some simple cases for language fallbacks (including transliterations) but we have not touched that development item yet. We have to see how this works out.
But in short, I am wary of automatic systems and would rather rely on the knowledge of the editors.
I hope that makes sense.
Cheers,
Denny
2012/8/14 Amir E. Aharoni amir.aharoni@mail.huji.ac.il:
2012/8/14 Nikola Smolenski smolensk@eunet.rs:
On 14/08/12 08:57, Amir E. Aharoni wrote:
2012/8/14 Nikola Smolenski smolensk@eunet.rs:
I believe it should be possible to alleviate this problem to an extent by introducing automatic transcription between languages and specifying what language the mayor's "default" name is in. If automatic transcription gets it wrong, it could still be overridden when someone enters the name in another language.
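To make the proposed lookup order concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `get_label`, the `labels` dictionary layout, the `transcribe` callback, and the toy Serbian Cyrillic-to-Latin table are hypothetical and not part of Wikidata. It only shows the rule described above: a label entered by hand always wins, and automatic transcription from the default-language name is a fallback.

```python
from typing import Callable, Dict, Optional

# Hypothetical signature: (text, source_lang, target_lang) -> text or None.
Transcriber = Callable[[str, str, str], Optional[str]]


def get_label(labels: Dict[str, str], default_lang: str,
              target_lang: str, transcribe: Transcriber) -> Optional[str]:
    """Return the label for target_lang, preferring human-entered values."""
    # A label entered by an editor always overrides automatic output.
    if target_lang in labels:
        return labels[target_lang]
    # Otherwise fall back to transcribing the default-language name.
    source = labels.get(default_lang)
    if source is None:
        return None
    return transcribe(source, default_lang, target_lang)


# Toy table covering only the letters of one example name.
TOY_TABLE = {"Н": "N", "и": "i", "к": "k", "о": "o", "л": "l", "а": "a"}


def toy_transcribe(text: str, src: str, dst: str) -> Optional[str]:
    """Character-by-character transcription; gives up on unknown input."""
    if (src, dst) != ("sr-Cyrl", "sr-Latn"):
        return None
    try:
        return "".join(TOY_TABLE[ch] for ch in text)
    except KeyError:
        return None
```

With only a `sr-Cyrl` label present, the `sr-Latn` label comes from the transcriber; once an editor enters a `sr-Latn` label, that value is returned instead. A per-character table like this is exactly the kind of approach that breaks down for scripts that omit vowels, which is the objection raised in the reply that follows.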
It is guaranteed to be profoundly broken. The above-mentioned Hebrew names will be transliterated as <'mrm mcn'> (the apostrophes are part
Would it? How many Hebrew names are there that are spelled "עמרם"? If the transliteration software knows it's a human name it can transliterate it as "Amram".
What you say is kinda true, but in practice it's much more complicated. I worked for a few years in a company that makes software that does this and I was the lead developer. There are two software packages that do it for Hebrew; they are proprietary and very expensive. It's not that making a Free package is impossible, but you need a team for every language that has such problems, you need several full-time people to maintain the words, and worst of all, most words have six or so possible pronunciations. Sure, crowdsourcing in Wikidata may change that, but it's too early to talk about this.
AFAIK the situation is even worse in Arabic, which is a much bigger language than Hebrew.
What I'm getting at is, again, that some limited transliteration assistance may be OK, but its output must not be propagated automatically. Naïve people may think that that's how the name is actually written, and in such matters most people are very naïve.
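One way to picture "help, but no automatic propagation" is the following minimal Python sketch. The `Label` class and the `suggest`, `confirm`, and `public_label` functions are hypothetical names made up for this illustration, not an existing Wikidata API. The idea is that machine output is stored only as an unconfirmed suggestion and is never shown to readers until an editor accepts or corrects it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Label:
    text: str
    confirmed: bool  # True only once a human editor has accepted the text


def suggest(labels: Dict[str, Label], lang: str, machine_text: str) -> None:
    """Record machine output as a suggestion; never overwrite an entry."""
    if lang not in labels:
        labels[lang] = Label(machine_text, confirmed=False)


def confirm(labels: Dict[str, Label], lang: str,
            text: Optional[str] = None) -> None:
    """An editor accepts the current suggestion or replaces it with text."""
    current = labels.get(lang)
    final = text if text is not None else (current.text if current else None)
    if final is None:
        raise ValueError("nothing to confirm for " + lang)
    labels[lang] = Label(final, confirmed=True)


def public_label(labels: Dict[str, Label], lang: str) -> Optional[str]:
    """Readers only ever see labels that a human has confirmed."""
    label = labels.get(lang)
    return label.text if label is not None and label.confirmed else None
```

For example, after `suggest(labels, "en", "'mrm")` the public label for `en` is still absent; it appears only after an editor calls `confirm(labels, "en", "Amram")`, and later machine suggestions do not replace it.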
-- Amir
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l