[Foundation-l] [Wikitech-l] Primary account for single user login
Tim Starling
tstarling at wikimedia.org
Wed Apr 9 14:24:50 UTC 2008
Simetrical wrote:
> On Wed, Apr 9, 2008 at 9:41 AM, Tim Starling <tstarling at wikimedia.org> wrote:
>> It supports custom rule sets, and Wikipedians may some day submit such a
>> rule set that can transliterate Japanese and Korean names.
>
> Surely the issue with Japanese and Korean is that they use the same
> code points as Chinese. The character 日 is transliterated "hi" in
> Japanese, but "rì" in Chinese, in each case meaning "day". It can't
> possibly guess which one to translate to, so presumably it must choose
> one of them arbitrarily. A custom rule set won't help that. Of
> course, in some cases it's presumably not ambiguous whether a string
> is Chinese/Japanese/Korean; in those cases it could hypothetically
> make a guess and follow it consistently, which apparently it doesn't
> (someone gave an example above of something that was transliterated
> partly to romaji and partly to pinyin).
You seem to be implying that the transliterator cannot possibly know what
language the name is in. MediaWiki can supply that information. It can
select a rule set based on the current user language.
Rules are not limited to single-character context-free conversions. Please
read the "Designing Transliterators" section of:
http://www.icu-project.org/userguide/Transform.html
The character "日" rarely appears alone; rather, it appears as part of
larger lexical units which, once identified, usually have an unambiguous
pronunciation.
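To illustrate the two points above (a rule set chosen by language, and
rules that match larger units before single characters), here is a rough
Python sketch. It is not ICU's API or rule syntax; the RULES table, its
few entries, and the transliterate() function are hypothetical and exist
only for this example. It tries the longest matching unit first and falls
back to shorter ones.

```python
# Toy rule sets keyed by language code. These tables are hypothetical
# illustrations, not real ICU transliteration rules.
RULES = {
    "ja": {"日曜日": "nichiyōbi", "日本": "nihon", "日": "hi"},
    "zh": {"日本": "rìběn", "日": "rì"},
}


def transliterate(text, lang):
    """Transliterate text using the rule set selected by lang.

    At each position the longest matching unit wins, so multi-character
    words take precedence over single-character readings.
    """
    rules = RULES[lang]
    longest = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # No rule applies: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for lang in ("ja", "zh"):
        print(lang, transliterate("日本", lang), transliterate("日", lang))
```

With the same input string, the result depends on which rule set the
caller selects, and "日" inside "日本" is never converted in isolation.
A real rule set would need far more entries and proper word segmentation.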
-- Tim Starling