At 09:59 AM 6/19/2006, Guettarda wrote:
Don't forget the accounts which exist to prevent impersonation in one's main language - names with capital I's to mimic L's and Cyrillic characters. Deleting these would just open things up for abuse again.
Is there a reason why user names with weird Unicode characters are even allowed? It would seem sensible to limit user names on each Wikipedia to the alphabet that is used in that language.
Chl
On 6/19/06, Chris Lüer chris@zandria.net wrote:
Is there a reason why user names with weird Unicode characters are even allowed? It would seem sensible to limit user names on each Wikipedia to the alphabet that is used in that language.
That's probably excessively restrictive. I don't see that a name like "Erik Möller" or "Café Paris" should necessarily be banned...but some of the more esoteric characters are definitely more trouble than they're worth.
Steve
Chris Lüer wrote:
At 09:59 AM 6/19/2006, Guettarda wrote:
Don't forget the accounts which exist to prevent impersonation in one's main language - names with capital I's to mimic L's and Cyrillic characters. Deleting these would just open things up for abuse again.
Is there a reason why user names with weird Unicode characters are even allowed? It would seem sensible to limit user names on each Wikipedia to the alphabet that is used in that language.
Chl
A suggestion, based on practices used for IDN registration:
Restrict new usernames on the en: Wikipedia to characters from the Latin alphabet and selected punctuation only (and possibly digits as well).
Before allowing a username to be registered, generate a canonical comparison form by Unicode normalization, lowercasing, punctuation and space suppression and accent-stripping, followed by homograph canonicalizations such as mapping both digit zero and letter O to the latter, digit 1 and letter L to the latter, eth to lowercase d, etc.
A new username should then only be allowed to be registered if the comparison form of the proposed new username is different from the comparison form of every existing username (which are stored in an indexed table, alongside the full, uncanonicalized name that actually gets registered).
Doing this will eliminate the vast majority of all simple username spoofing hacks.
Existing usernames get grandfathered in, of course.
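The steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not MediaWiki code: the names `comparison_form`, `is_allowed`, `UsernameRegistry`, `HOMOGRAPHS` and `ALLOWED_PUNCTUATION` are made up for the example, the punctuation selection is arbitrary, and the i-to-l mapping is an extra assumption taken from the capital-I example earlier in the thread. Cross-script lookalikes (for example a Cyrillic letter that resembles a Latin one) would need a much larger table and are not handled here.

```python
import unicodedata

# Hypothetical homograph table. The first three entries are the mappings
# named in the proposal; "i" -> "l" is an added assumption so that a
# capital I (lowercased to "i") collides with a lowercase L.
HOMOGRAPHS = {"0": "o", "1": "l", "\u00f0": "d", "i": "l"}

# Hypothetical choice of "selected punctuation".
ALLOWED_PUNCTUATION = set(" -_.'")


def is_allowed(name: str) -> bool:
    """Accept only Latin letters (accented ones included), digits and
    the selected punctuation."""
    if not name:
        return False
    for ch in unicodedata.normalize("NFC", name):
        if ch in ALLOWED_PUNCTUATION or ch in "0123456789":
            continue
        # Unicode character names for Latin letters start with "LATIN ".
        if ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN "):
            continue
        return False
    return True


def comparison_form(name: str) -> str:
    """Build the canonical form used only for collision checks."""
    # NFKD normalization also splits accented letters into a base
    # letter plus combining marks, which makes accent-stripping easy.
    decomposed = unicodedata.normalize("NFKD", name).lower()
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # accent-stripping
        ch = HOMOGRAPHS.get(ch, ch)  # homograph canonicalization
        if ch.isalnum():
            out.append(ch)  # punctuation and spaces are dropped
    return "".join(out)


class UsernameRegistry:
    """Maps each comparison form to the full name actually registered."""

    def __init__(self) -> None:
        self._by_form: dict[str, str] = {}

    def register(self, name: str) -> bool:
        if not is_allowed(name):
            return False
        form = comparison_form(name)
        if not form or form in self._by_form:
            return False  # collides with an existing username
        self._by_form[form] = name
        return True
```

With this sketch, registering "Chris Lüer" succeeds, a later attempt to register "chris_luer" is refused because both names share one comparison form, and a name written in Cyrillic is refused by the character check. The dictionary stands in for the indexed table, and grandfathered names would simply be loaded into it without going through `is_allowed`.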
-- Neil
I remember having an imposter with a weird Unicode version of the first "N" in my username. In fact, it's my only imposter.
I think such usernames should be blocked too.
- Nathan (nathanrdotcom)
Nathan wrote:
I remember having an imposter with a weird Unicode version of the first "N" in my username. In fact, it's my only imposter.
I think such usernames should be blocked too.
- Nathan (nathanrdotcom)
I think *your* username is inappropriate since you're spamming your URL in it.
The only problem is languages that don't use the Latin alphabet.
You can't restrict them to a single character set. Take Russian, for example: I don't want to be User:ильянеп, but then again there may be users who use Cyrillic in their names.
On 6/20/06, Neil Harris usenet@tonal.clara.co.uk wrote:
Chris Lüer wrote:
At 09:59 AM 6/19/2006, Guettarda wrote:
Don't forget the accounts which exist to prevent impersonation in one's main language - names with capital I's to mimic L's and Cyrillic characters. Deleting these would just open things up for abuse again.
Is there a reason why user names with weird Unicode characters are even allowed? It would seem sensible to limit user names on each Wikipedia to the alphabet that is used in that language.
Chl
A suggestion, based on practices used for IDN registration:
Restrict new usernames on the en: Wikipedia to characters from the Latin alphabet and selected punctuation only (and possibly digits as well).
Before allowing a username to be registered, generate a canonical comparison form by Unicode normalization, lowercasing, punctuation and space suppression and accent-stripping, followed by homograph canonicalizations such as mapping both digit zero and letter O to the latter, digit 1 and letter L to the latter, eth to lowercase d, etc.
A new username should then only be allowed to be registered if the comparison form of the proposed new username is different from the comparison form of every existing username (which are stored in an indexed table, alongside the full, uncanonicalized name that actually gets registered).
Doing this will eliminate the vast majority of all simple username spoofing hacks.
Existing usernames get grandfathered in, of course.
-- Neil
WikiEN-l mailing list WikiEN-l@Wikipedia.org To unsubscribe from this mailing list, visit: http://mail.wikipedia.org/mailman/listinfo/wikien-l
Chris Lüer wrote:
Don't forget the accounts which exist to prevent impersonation in one's main language - names with capital I's to mimic L's and Cyrillic characters. Deleting these would just open things up for abuse again.
Is there a reason why user names with weird Unicode characters are even allowed? It would seem sensible to limit user names on each Wikipedia to the alphabet that is used in that language.
Well, some people have real names with weird Unicode letters, Chris Lüer. ;-)