[Foundation-l] Automatic username transliteration for SUL
Stephanie Erin Daugherty
stephanie at sosdg.org
Thu Dec 21 03:50:33 UTC 2006
> Even "ko-ho-ba-fay-jo-muh" is better than ????????
>
> Of course, we could always take the approach of putting them in IPA,
> thus annoying everyone equally.
>
> -- Neil
>
I think the most important part of the objection to "foreign" character
sets is that many people on en in particular are unaware of what
facilities exist for dealing with them. Because people in countries using
Latin character sets rarely see, and almost never have to work with,
anything in another character set, they are usually unaware of
what tools exist for interoperating with them. This is especially true
when abusive users have deliberately taken advantage of this fact in
order to make the lives of administrators and of other Wikipedians as
difficult as possible, and it's also true for the handful of users
who mix character sets to "look cool" at the inconvenience of others.
The fact that usernames in foreign character sets pose special technical
challenges for users unfamiliar with them, and that mainstream
multilingual support is especially lacking in applications with a Latin
language family ethnocentricity (in particular a large number of Windows
applications and Windows itself, at least for the en-US locale), means that
functions for working with usernames need to be looked at carefully.
One of the more obvious ways that this can be made to work is to make
numeric userids more visible and more useful for various operations
where a username may be near impossible to type, and may even be
difficult to see. I'd strongly recommend this for usernames using
characters outside the ranges that a typical user on a given project
will be able to enter with a "normal" input method. Displaying something
like User:???????? <#2352562> would help considerably for those
situations, but it needs to work consistently for things like accessing
talk pages, accessing contributions, accessing userpages, and accessing
logs.
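A rough sketch of the display rule described above, in Python. The function names and the cutoff for what counts as "easily typed" (anything up to the end of Latin Extended-B, U+024F) are assumptions made for illustration; a real project would pick its own range, and nothing here reflects actual MediaWiki code.

```python
def is_easily_typed(username: str, max_codepoint: int = 0x024F) -> bool:
    """Return True if every character is at or below max_codepoint.

    The default of U+024F (end of Latin Extended-B) is an illustrative
    assumption for a Latin-script project, not an established rule.
    """
    return all(ord(ch) <= max_codepoint for ch in username)


def display_name(username: str, user_id: int) -> str:
    """Append the numeric user ID only when the name is hard to type."""
    label = f"User:{username}"
    if not is_easily_typed(username):
        label += f" <#{user_id}>"
    return label


if __name__ == "__main__":
    print(display_name("Neil", 42))
    print(display_name("مثال", 2352562))
```

The same numeric ID would then have to be accepted wherever a username is accepted (talk pages, contributions, userpages, logs) for this to be useful in practice.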
"Nicknames" or aliases, as proposed by someone else in this thread, would
also help - they could be constrained to the characters typically usable
on a given wiki. I'd also suggest that when showing a "non-native"
username, we indicate clearly what character set or even what
language it's in. This will become even more useful once SUL is
implemented (hopefully it's going to be part of SUL anyway; dealing with
namespace collisions otherwise will be insane).
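As a sketch of how a script label might be derived, the Python below guesses the script of each letter from the first word of its Unicode character name. This is a heuristic chosen because the standard library does not expose the Unicode Script property directly; the function names are hypothetical, and it identifies the script only, not the language. It also flags names that mix scripts.

```python
import unicodedata


def scripts_in(username: str) -> list:
    """List the scripts used by the letters of a username, in order seen.

    Heuristic: take the first word of each letter's Unicode name
    (e.g. "ARABIC LETTER MEEM" gives "Arabic"). Non-letters are skipped.
    """
    seen = []
    for ch in username:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        script = name.split(" ")[0].title() if name else "Unknown"
        if script not in seen:
            seen.append(script)
    return seen


def labelled(username: str) -> str:
    """Format a username with a script label, noting mixed scripts."""
    scripts = scripts_in(username)
    if not scripts:
        return f"User:{username}"
    if len(scripts) == 1:
        tag = scripts[0]
    else:
        tag = "mixed: " + ", ".join(scripts)
    return f"User:{username} ({tag})"


if __name__ == "__main__":
    print(labelled("مثال"))
    print(labelled("p\u0430ypal"))  # second letter is Cyrillic U+0430
```

Flagging mixed-script names separately is a design choice aimed at the "look cool" and deliberate-confusion cases mentioned earlier, where a name looks Latin but is not.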
Transliterations would be useful in some situations, but I'd suggest we
make this a display option - to be respectful of other cultures means
that we should respect their writing and their culture wherever
possible. IPA could be handled in the same way, and this would probably
be appreciated by at least a few people who would otherwise be confused
about how to pronounce a given name.
Would a format for usernames with foreign characters like
"User:???????? (Arabic) <#2352562>" really be so bad? (This would apply
equally to projects where Latin characters might not be displayable.)
-Stephanie