On Wikipedia-l or something similar there has been discussion involving English Wikipedia's policy of blocking users with usernames that do not use the Latin alphabet. Reasons for opposition to this practice include ethnocentrism and messing up SUL. However, a point was raised that people unfamiliar with the script will just see it as a bunch of squiggly lines. A suggested remedy was having people transliterate their names depending on the wiki.
I'm interested in combining this with a script similar to the Automatic Conversion script employed on the Chinese Wikipedia, that would, combined with SUL, automatically transliterate usernames contingent on the wiki they are on. For example, on French Wikipedia, your username would be in Latin script, whereas on the Hebrew Wikipedia your username would be in Hebrew script and on Arabic Wikipedia your username would be in Arabic script.
I know there are scripts out there that can transliterate -- in addition to the aforementioned conversion script on the Chinese Wikipedia, there are quite a few scripts out there that will allow you to input something and have it output in a different script.
http://vereb.free.fr/transliteration/transliterator.html has a pretty cool one that allows you to type with Latin alphabet settings and it will output in a different script. For example,
- MessedRocker becomes МесседРоцкер with the Cyrillic setting
- MessedRocker becomes ΜεσσεδΡοcκερ with the Greek setting
- MessedRocker becomes مِصِدرُطكِر with the Arabic setting.
If something could be created for SUL that would take a username and transliterate it depending on the language of the wiki, that would be great. I understand there are technical issues involved, but I would like to discuss it on a community level.
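To make the idea concrete, here is a minimal sketch of per-wiki display transliteration. Everything in it is an assumption for illustration: the function names, the wiki-to-script mapping, and the tiny letter table (chosen so that it reproduces the Cyrillic example above) are hypothetical and are not taken from the Chinese Wikipedia conversion script or from any existing SUL code.

```python
# Hypothetical letter-by-letter table (lowercase Latin -> lowercase Cyrillic).
# A real table would need digraphs, context rules, and per-language choices.
LATIN_TO_CYRILLIC = {
    "a": "а", "b": "б", "c": "ц", "d": "д", "e": "е", "f": "ф",
    "g": "г", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
    "v": "в", "z": "з",
}

# Hypothetical mapping from wiki language code to the script displayed there.
WIKI_SCRIPT = {"en": "latin", "fr": "latin", "ru": "cyrillic"}


def to_cyrillic(name: str) -> str:
    """Transliterate letter by letter, preserving case; unknown characters pass through."""
    out = []
    for ch in name:
        mapped = LATIN_TO_CYRILLIC.get(ch.lower())
        if mapped is None:
            out.append(ch)
        elif ch.isupper():
            out.append(mapped.upper())
        else:
            out.append(mapped)
    return "".join(out)


def display_name(username: str, wiki_lang: str) -> str:
    """Return the form of the username to show on the given wiki."""
    script = WIKI_SCRIPT.get(wiki_lang, "latin")
    if script == "cyrillic":
        return to_cyrillic(username)
    return username


print(display_name("MessedRocker", "ru"))
print(display_name("MessedRocker", "fr"))
```

The stored username stays the same; only the displayed form changes per wiki, which is the part that would have to interact with SUL.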
--James
If we can make it as simple as that (and acknowledge that the automatic transliteration will often be very, very bad) we could possibly make this work, with the option to change your nick to something else later.
Perhaps language -> IPA -> language might be a good way of helping the m x n problem for this, in which case we could use language <-> IPA tables to bootstrap this.
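A rough sketch of the pivot idea, assuming toy tables: with m source scripts and n target scripts, direct conversion needs on the order of m x n tables, while going through IPA needs only one table into IPA and one out of IPA per script. The table contents and the `transliterate` function below are made up for illustration, and they treat one character as one phoneme, which real scripts do not allow.

```python
# Hypothetical script -> IPA tables (one per source script).
TO_IPA = {
    "latin": {"b": "b", "o": "o"},
    "cyrillic": {"б": "b", "о": "o"},
}

# Hypothetical IPA -> script tables (one per target script).
FROM_IPA = {
    "latin": {"b": "b", "o": "o"},
    "cyrillic": {"b": "б", "o": "о"},
    "greek": {"b": "μπ", "o": "ο"},
}


def transliterate(text: str, source: str, target: str) -> str:
    """Convert text from source script to target script via an IPA pivot."""
    # Step 1: source script -> IPA symbols (unknown characters pass through).
    phonemes = [TO_IPA[source].get(ch, ch) for ch in text]
    # Step 2: IPA symbols -> target script.
    return "".join(FROM_IPA[target].get(p, p) for p in phonemes)


print(transliterate("bob", "latin", "cyrillic"))
print(transliterate("bob", "latin", "greek"))
```

Adding a new script then means adding at most two tables, not one for every existing script.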
Would this be a good idea, or would it be linguistic nonsense?
-- Neil
I don't know if IPA would work... I mean for all we know, "Bob" could have some random pronunciation of "ko-ho-ba-fay-jo-muh". When it comes to storing it in the database, I agree it should be some uniform script, like a Federation Standard of sorts...
On 12/20/06, Neil Harris usenet@tonal.clara.co.uk wrote:
If we can make it as simple as that (and acknowledge that the automatic transliteration will often be very, very bad) we could possibly make this work, with the option to change your nick to something else later.
Perhaps language -> IPA -> language might be a good way of helping the m x n problem for this, in which case we could use language <-> IPA tables to bootstrap this.
Would this be a good idea, or would it be linguistic nonsense?
-- Neil
foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l
James Hare wrote:
I don't know if IPA would work... I mean for all we know, "Bob" could have some random pronunciation of "ko-ho-ba-fay-jo-muh". When it comes to storing it in the database, I agree it should be some uniform script, like a Federation Standard of sorts...
Even "ko-ho-ba-fay-jo-muh" is better than ????????
Of course, we could always take the approach of putting them in IPA, thus annoying everyone equally.
-- Neil
I think the most important part of the objection to "foreign" character sets is that many people, on en in particular, are unaware of what facilities exist for dealing with them. As people in countries using Latin character sets rarely see, and almost never have to work with, anything that's in another character set, they are usually unaware of what tools exist for interoperating with them. This is especially true when abusive users have deliberately taken advantage of this fact in order to make the lives of administrators and of other Wikipedians as difficult as possible, and it's also true for that handful of users who will mix character sets to "look cool" at the inconvenience of others.
The fact that usernames in foreign character sets pose special technical challenges for users unfamiliar with them, and that mainstream multilingual support is especially lacking in applications with a Latin language family ethnocentricity (in particular a large number of Windows applications and Windows itself, at least for the en-US locale), means that functions for working with usernames need to be looked at carefully.
One of the more obvious ways that this can be made to work is to make numeric userids more visible and more useful for various operations where a username may be near impossible to type, and may even be difficult to see. I'd strongly recommend this for usernames using characters outside the ranges that a typical user on a given project will be able to enter with a "normal" input method. Displaying something like User:???????? <#2352562> would help considerably for those situations, but it needs to work consistently for things like accessing talk pages, accessing contributions, accessing userpages, and accessing logs.
"Nicknames" or aliases, as proposed by someone else in this thread, would also help - they could be constrained to the characters typically usable on a given wiki. I'd also suggest that when showing a "non-native" username, we indicate clearly what character set or even what language it's in. This will become even more useful once SUL is implemented (hopefully it's going to be part of SUL anyway; dealing with namespace collisions otherwise will be insane).
Transliterations would be useful in some situations, but I'd suggest we make this a display option - to be respectful of other cultures means that we should respect their writing and their culture wherever possible. IPA could be handled in the same way, and this would probably be appreciated by at least a few people who would otherwise be confused on how to pronounce a given name.
Would a format for usernames with foreign characters like "User:???????? (Arabic) <#2352562>" really be so bad? (This would apply equally to projects where Latin characters might not be displayable.)
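As a rough illustration of that display format, the sketch below labels a username with its script and numeric ID, and also flags names that mix scripts. It is only a heuristic under stated assumptions: Python's standard library does not expose the Unicode Script property, so the first word of each character's Unicode name is used as a stand-in, and the function names `scripts_of` and `describe_username` are hypothetical.

```python
import unicodedata


def scripts_of(name: str) -> set[str]:
    """Guess the scripts used by the letters in name.

    Heuristic: the first word of the Unicode character name
    (e.g. "ARABIC LETTER MEEM" -> "ARABIC"). Non-letters are ignored.
    """
    scripts = set()
    for ch in name:
        if not ch.isalpha():
            continue
        char_name = unicodedata.name(ch, "")
        if char_name:
            scripts.add(char_name.split()[0])
    return scripts


def describe_username(name: str, user_id: int) -> str:
    """Format a username as 'User:<name> (<script>) <#id>'."""
    scripts = sorted(scripts_of(name))
    if len(scripts) == 1:
        label = scripts[0].capitalize()
    elif scripts:
        # More than one script: possibly a "look cool" or lookalike name.
        label = "Mixed: " + ", ".join(s.capitalize() for s in scripts)
    else:
        label = "Unknown"
    return f"User:{name} ({label}) <#{user_id}>"


print(describe_username("\u0645\u062d\u0645\u062f", 2352562))
print(describe_username("B\u043eb", 42))  # the middle letter is Cyrillic
```

The script label only tells a reader what they are looking at; telling the language apart (Arabic versus Persian, say) would need information the username itself does not carry.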
-Stephanie
Hoi, At issue is that the function of NOT seeing the characters, hence the ??????, is a function of your local system. It is easily remedied by installing fonts. These are for the majority of languages available as part of your operating system. Making this a display option will get you into problems too. It would help if we knew what the primary language is of the person involved. That is something that is relatively easy to address as we have to address it in the "Multilingual MediaWiki". Here it is however not compulsory to make this info available.
I want to repeat my question I posted before: Is this the Wikimedia Foundation that allows people to edit its projects anonymously? Does this whole idea not reek of bad faith? How hard is it to have people install fonts if they object to having to see ??????
Thanks, GerardM
2006/12/21, Gerard Meijssen gerard.meijssen@gmail.com:
Is this the Wikimedia Foundation that allows people to edit its projects anonymously? Does this whole idea not reek of bad faith? How hard is it to have people install fonts if they object to having to see ??????
A minor point, but...
What about the people who edit from computers on which they can not install software? Public computers, computers at work, the parents' or boyfriends' computer. Lots of people do this from time to time. Then we have people like me. I've tried a couple of times to get Japanese characters in my box simply because I think it would be cool but I have failed, maybe because my OS is old and stuff is not supported any more - maybe because I just did it wrong. (No info on how to do that in this thread, please. I promise you, the first three things you'll tell me I have already tried - and it does not belong on this list, anyhow.)
/HB
I would like to say they don't even need to install fonts ... just switch their browser's character encoding setting from US-en or Latin-1 (iso-8859-1) to UTC-8. Sometimes font installation is necessary, but in my experience adjusting the browser setting is adequate in most cases.
On 12/21/06, Aphaia aphaia@gmail.com wrote:
I would like to say they don't even need to install fonts ... just switch their browser's character encoding setting from US-en or Latin-1 (iso-8859-1) to UTF-8's misspelled cousin UTC-8. Sometimes font installation is necessary, but in my experience adjusting the browser setting is adequate in most cases.
You mean UTF-8. UTC-8 is Los-Angeles-centric ;)
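The encoding point can be shown in a few lines. This is only a sketch of the general mechanism, not a description of what any particular browser does: the same UTF-8 bytes read as ISO-8859-1 turn into unrelated Latin characters, and text forced into ISO-8859-1 loses anything outside that character set, which is one common source of literal question marks. The helper names are made up for the example.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode with the wrong charset (ISO-8859-1)."""
    return text.encode("utf-8").decode("iso-8859-1")


def force_into_latin1(text: str) -> bytes:
    """Encode into ISO-8859-1, replacing unrepresentable characters with '?'."""
    return text.encode("iso-8859-1", errors="replace")


name = "\u0417\u043e\u044f"  # a three-letter name in Cyrillic

# Decoded with the right charset, the name survives the round trip.
print(name.encode("utf-8").decode("utf-8") == name)

# Decoded as ISO-8859-1, each letter becomes two unrelated characters.
print(len(misread_as_latin1(name)))

# Forced into ISO-8859-1, every letter is replaced by a question mark.
print(force_into_latin1(name))
```

Neither of these cases is fixed by installing a font, and a missing font is not fixed by changing the encoding, so both remedies mentioned in the thread apply to different failures.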
2006/12/21, Gerard Meijssen gerard.meijssen@gmail.com:
At issue is that the function of NOT seeing the characters, hence the ??????, is a function of your local system. It is easily remedied by installing fonts. These are for the majority of languages available as part of your operating system.
Well, I do see where this is not as easy as you make it seem. To do this, you will need to know:
* which character set the name is in
* where to download a font for that language to install.
I am going around Wikipedia a lot, and sometimes get to languages where I don't have a font installed. Usually I find those Wikipedias where such problems arise (that is, those that are written in relatively rare alphabets rather than Cyrillic or such) have a link to a downloadable freeware font on an easy-to-find page (the Main Page or its talk page). But if someone used such a script on English Wikipedia, you might not even know whether it's Thai, Kannada or Cree, let alone where to download a font.
*I* know I need to install fonts for proper multilingual support, but doing that doesn't get me multilingual text input, and very, very few people who have not already had to deal with such things as input methods will know how to input characters in foreign character sets, much less know what they are to input them correctly.
So the argument stands that this is as much a technical problem as a social one - and yes, the attitude of going against anything you don't understand, or that inconveniences you slightly, is one of our worst examples of bad faith - and I venture to say a typical Americanism.
This technical problem is not likely to be confined to en though, it's just better managed in cultures that are multilingual, since Wikipedians exposed to such cultures are more likely to already have or have been exposed to the tools (fonts and input methods) needed to work efficiently with foreign character sets, and are more likely to know how to use them.
The biggest thing we need to solve IMHO, is making sure that it's possible to type usernames, or something corresponding to usernames, without needing to have special input methods set up, and without using any characters that can't be directly typed on a typical keyboard for the language in question. Until a better solution arises, the ability to use the numeric userid as a "surrogate" for the username in URLs and text entry seems to be the safest bet.
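A minimal sketch of what such a surrogate could look like on the input side, assuming a "#" prefix marks a numeric userid. The lookup table, the function names, and the "/wiki/User:" path are all hypothetical and are not a description of how MediaWiki resolves users.

```python
from urllib.parse import quote

# Hypothetical id -> username table standing in for the user database.
USERS_BY_ID = {2352562: "\u0645\u062d\u0645\u062f"}


def resolve_user(reference: str, users_by_id: dict[int, str]) -> str:
    """Accept either a username or a '#<numeric id>' surrogate.

    Raises KeyError if the numeric id is unknown.
    """
    digits = reference[1:]
    if reference.startswith("#") and digits.isascii() and digits.isdigit():
        return users_by_id[int(digits)]
    return reference


def user_page_path(reference: str, users_by_id: dict[int, str]) -> str:
    """Build a percent-encoded path to the user's page (illustrative layout)."""
    return "/wiki/User:" + quote(resolve_user(reference, users_by_id))


# Typable on any keyboard, yet it resolves to the non-Latin username.
print(user_page_path("#2352562", USERS_BY_ID))
print(user_page_path("Bob", USERS_BY_ID))
```

The same resolution step would have to sit in front of talk pages, contributions, and logs for the surrogate to work consistently.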
-Stephanie
Hoi. When you get ?????? it makes sense to install more fonts. The use of IPA is problematic because it is often done incorrectly and most people have not had training to do IPA correctly. It takes a lot of training to do this correctly. My name for instance is notoriously difficult to pronounce for people who are not Dutch while even a name like "Bob" should be written in many ways when you consider in how many ways it is pronounced.
When IPA is used correctly, you will get characters that people will not be able to read and consequently your problem is not different from the one you tried to solve.
Thanks, GerardM
On 21/12/06, Gerard Meijssen gerard.meijssen@gmail.com wrote:
When you get ?????? it makes sense to install more fonts. The use of IPA is problematic because it is often done incorrectly and most people have not had training to do IPA correctly. It takes a lot of training to do this correctly. My name for instance is notoriously difficult to pronounce for people who are not Dutch while even a name like "Bob" should be written in many ways when you consider in how many ways it is pronounced.
Indeed. But we're not going to be able to make it a requirement of editing en:wp that users install more fonts on their (arguably broken) default-install Windows box.
- d.
And, as I keep on saying, the addition of the correct fonts to the user's browser does not magically grant the ability to read or even recognize characters in scripts that a reader is unused to. ("His name? It's three little boxes, and in the boxes there are a lot of little straight lines, apart from the ones that are like little hooks or squares. In the leftmost box, there's a bit of a gap at the bottom left, and then there's a sort of hooky bit hanging out at the right of the middle box, which has a few less lines, and then...")
-- Neil
Obviously, with SUL any en:wp editor who does not learn to read all other scripts used on Wikimedia is simply being a cultural imperialist.
- d.
Hoi, There is at the moment a big row brewing over the way the Internet insists on having URLs in Latin script. People who do not write in Latin have to use Latin to be able to use the Internet. This notion of being able to enter URLs in your own script is really intuitive. When this change is going to be implemented you will still be able to access the websites that will no longer be available with a Latin script.
It will be a big change when it happens; the issue raised here about the insistence of the English-language Wikipedia is a harbinger of many more issues to come.
Thanks, GerardM
This does not appear to answer either message above.
You're not explaining your point in terms other than appearing to automatically dislike any suggestion from native English speakers on this thread, so it's hard to see what you're putting forward as any sort of solution to the problems raised other than "fuck off." Is that your actual intended message? Or will you at some stage answer this thread with something that acknowledges and addresses the problems raised?
- d.
Hi Gerard,
Sorry just a short comment, maybe I misunderstood your statement.
Actually (or at least, it was the case a year and a half ago), you can have URLs in non-Latin scripts. I spent some time working in Seoul and all my colleagues were using URLs written in hangul (the local script) daily. It was indeed a big surprise for me as I thought it was not possible, and I was glad they had this capability.
Cheers, Jerome
Gerard,
You might be surprised to hear that I'm one of the many people working on that particular problem. The sub-problems needed to be solved in order to implement native-script display for all of the components of IRIs in a way that is secure, interoperable, and reliable are non-trivial. Sorting out Wikipedia usernames is a relatively trivial problem when compared to IRIs.
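For readers unfamiliar with IRIs, here is a small sketch of the usual mapping from an IRI to an ASCII-only URI: the host name is converted with IDNA (Punycode) and the remaining components are percent-encoded as UTF-8. The function name `iri_to_uri` is made up for this example, and the sketch makes simplifying assumptions: Python's built-in `idna` codec follows the older IDNA 2003 rules, userinfo is dropped, and none of the security questions (such as lookalike characters) are addressed.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str) -> str:
    """Convert an IRI to an ASCII-only URI (simplified sketch)."""
    parts = urlsplit(iri)

    # Host: each label is converted to its ASCII-compatible "xn--" form.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += f":{parts.port}"

    # Other components: percent-encode non-ASCII as UTF-8 bytes.
    # "%" is kept safe so already-encoded input is not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://b\u00fccher.example/wiki/Z\u00fcrich"))
```

What the user types and sees is the native-script form; what goes over the wire is the ASCII form, and keeping the two in step safely is where the hard problems are.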
-- Neil
David Gerard wrote:
Indeed. But we're not going to be able to make it a requirement of editing en:wp that users install more fonts on their (arguably broken) default-install Windows box.
- d.
Hoi, You do not have to make it a requirement as it is their option to upgrade their system with better font support. As there is a reasonable solution that is open to people, it is unreasonable to expect others to accommodate them by insisting on having a transliteration, a Latin nick, or having people appear as a number. Thanks, GerardM
On 21/12/06, Gerard Meijssen gerard.meijssen@gmail.com wrote:
You do not have to make it a requirement as it is their option to upgrade their system with better font support. As there is a reasonable solution that is open to people, it is unreasonable to expect others to accommodate them by insisting on having a transliteration, a Latin nick, or having people appear as a number.
"Wikipedia: The encyclopedia any linguist who has system access to add fonts to their computer can edit!"
I don't see why being able to enter a numeric ID for a username you don't have an input method for is so intrinsically offensive, except that it's being proposed by a native English speaker.
- d.
Gerard Meijssen wrote:
Hoi, You do not have to make it a requirement, as it is their option to upgrade their system with better font support. As there is a reasonable solution that is open to people, it is unreasonable to expect others to accommodate them by insisting on a transliteration, a Latin nick, or having people appear as a number.
"I am a man, not a number!"...
2006/12/21, David Gerard dgerard@gmail.com:
Indeed. But we're not going to be able to make it a requirement of editing en:wp that users install more fonts on their (arguably broken) default-install Windows box.
And they don't have to. You can still edit Wikipedia just fine while seeing boxes or question marks instead of the names of some of the other contributors, just like you can edit it just fine while seeing these instead of the titles of interwiki links.
Neil Harris wrote:
James Hare wrote:
On Wikipedia-l or something similar there has been discussion involving English Wikipedia's policy of blocking users with usernames that do not use the Latin alphabet. Reasons for opposition to this practice include ethnocentrism and messing up SUL. However, a point was raised that people unfamiliar with the script will just see it as a bunch of squiggly lines. A suggested remedy was having people transliterate their names depending on the wiki.
I'm interested in combining this with a script similar to the Automatic Conversion script employed on the Chinese Wikipedia, that would, combined with SUL, automatically transliterate usernames depending on the wiki they are on. For example, on French Wikipedia, your username would be in Latin script, whereas on the Hebrew Wikipedia your username would be in Hebrew script and on Arabic Wikipedia your username would be in Arabic script.
I know there are scripts out there that can transliterate -- in addition to the aforementioned conversion script on the Chinese Wikipedia, there are quite a few scripts out there that will allow you to input something and have it output in a different script.
http://vereb.free.fr/transliteration/transliterator.html has a pretty cool one that allows you to type with Latin alphabet settings and it will output in a different script. For example,
- MessedRocker becomes МесседРоцкер with the Cyrillic setting
- MessedRocker becomes ΜεσσεδΡοcκερ with the Greek setting
- MessedRocker becomes مِصِدرُطكِر with the Arabic setting.
If something could be created for SUL that would take a username and transliterate it depending on the language of the wiki, that would be great. I understand there are technical issues involved, but I would like to discuss it on a community level.
--James
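As a rough illustration of the per-wiki idea in the message above, here is a minimal Python sketch. The letter table, the `TABLES_BY_WIKI` mapping and both function names are made up for this example; nothing here is part of MediaWiki or SUL. A letter-by-letter table like this ignores pronunciation entirely, which is one reason the results can be poor.

```python
# Hypothetical per-letter table; it only covers the letters needed for the
# "MessedRocker" example from the message above.
LATIN_TO_CYRILLIC = {
    "m": "м", "e": "е", "s": "с", "d": "д",
    "r": "р", "o": "о", "c": "ц", "k": "к",
}

# Hypothetical mapping from a wiki's language code to a target-script table.
# None means the wiki already uses Latin script, so the name is left alone.
TABLES_BY_WIKI = {
    "ru": LATIN_TO_CYRILLIC,
    "fr": None,
}


def transliterate(name, table):
    """Map each character through the table, preserving upper/lower case."""
    out = []
    for ch in name:
        mapped = table.get(ch.lower())
        if mapped is None:
            out.append(ch)  # characters missing from the table stay unchanged
        elif ch.isupper():
            out.append(mapped.upper())
        else:
            out.append(mapped)
    return "".join(out)


def display_name(username, wiki_lang):
    """Return the name as it might be shown on a wiki in the given language."""
    table = TABLES_BY_WIKI.get(wiki_lang)
    if table is None:
        return username
    return transliterate(username, table)


print(display_name("MessedRocker", "ru"))  # МесседРоцкер
print(display_name("MessedRocker", "fr"))  # MessedRocker
```

The stored username never changes in this sketch; only the displayed form depends on the wiki, so a user could still override a bad automatic result with a hand-picked name.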
If we can make it as simple as that (and acknowledge that the automatic transliteration will often be very, very bad) we could possibly make this work, with the option to change your nick to something else later.
Perhaps language -> IPA -> language might be a good way of helping with the m x n problem for this, in which case we could use language <-> IPA tables to bootstrap this.
Would this be a good idea, or would it be linguistic nonsense?
My signature on en: is currently the IPA transliteration of how I pronounce my username.
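To make the m x n point concrete: with direct tables, every ordered pair of scripts needs its own table, while a pivot through IPA needs only one table per script. The Python sketch below uses three toy tables that are assumptions for illustration only; real grapheme-to-IPA conversion depends on language and context and cannot be done by simple lookup.

```python
# Toy grapheme -> IPA tables, one per script (illustrative assumptions only).
TO_IPA = {
    "latin":    {"m": "m", "o": "o", "n": "n"},
    "cyrillic": {"м": "m", "о": "o", "н": "n"},
    "greek":    {"μ": "m", "ο": "o", "ν": "n"},
}

# The reverse direction is derived by inverting each table, so adding a new
# script means writing one table rather than one per existing script.
FROM_IPA = {
    script: {ipa: grapheme for grapheme, ipa in table.items()}
    for script, table in TO_IPA.items()
}


def via_ipa(text, source, target):
    """Convert text between scripts by going source -> IPA -> target.

    Raises KeyError for any character the toy tables do not cover.
    """
    to_ipa = TO_IPA[source]
    from_ipa = FROM_IPA[target]
    phonemes = [to_ipa[ch] for ch in text]
    return "".join(from_ipa[p] for p in phonemes)


greek = via_ipa("mono", "latin", "greek")
print(greek)
print(via_ipa(greek, "greek", "latin"))  # round trip gives back "mono"
```

With 3 scripts this is 3 tables instead of 6 directional ones; with 20 scripts it would be 20 instead of 380. The saving only holds if each script maps cleanly to and from IPA, which is exactly the part in doubt.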
Neil Harris wrote:
If we can make it as simple as that (and acknowledge that the automatic transliteration will often be very, very bad) we could possibly make this work, with the option to change your nick to something else later.
Perhaps language -> IPA -> language might be a good way of helping with the m x n problem for this, in which case we could use language <-> IPA tables to bootstrap this.
Would this be a good idea, or would it be linguistic nonsense?
Probably linguistic nonsense. Although IPA serves a very useful purpose in conveying pronunciations, only a limited community interested in linguistics will be familiar with it. For everybody else (which to be fair must include the English speakers) it amounts to attaining equality by putting everything into a common language which all will equally not understand. :-)
Ec
What I would like to know is how a machine will be able to deduce a pronunciation from a username.
foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l
Hoi, It does not. There is no standard yet that differentiates between spoken dialects. With ISO 639-6 it will become possible to differentiate between linguistic entities, even for pronunciation. When this standard becomes functional, it will be possible to identify a word as associated with a specific spoken linguistic entity, and consequently software will be able to guess how it needs to be pronounced. Thanks, GerardM
On 12/24/06, James Hare messedrocker@gmail.com wrote:
What I would like to know is how a machine will be able to deduce a pronunciation from a username.
If we all sound like Stephen Hawking maybe the machine will guess that we are all geniuses. :-)
Ec
James Hare wrote:
What I would like to know is how a machine will be able to deduce a pronunciation from a username.
wikimedia-l@lists.wikimedia.org