Message: 14
Date: Sat, 11 Jan 2003 21:46:07 -0800
From: Jonathan Walther krooger@debian.org
To: wikitech-l@wikipedia.org
Subject: Re: [Wikitech-l] Re: Wikitech-l digest, Vol 1 #329 - 13 msgs
Reply-To: wikitech-l@wikipedia.org
On Fri, Jan 10, 2003 at 12:28:53PM -0800, Brion Vibber wrote:
What would you prefer? That we tell Anthere to take a hike and buy a new computer? That I petition my uni to upgrade hundreds of machines in their labs? That we ignore similar conditions across the world where people have old machines or machines they cannot control and tell them, hey, fuck off, Wikipedia's not for you you whiny bitch?
Your examples are legitimate. How would you feel if there was a user option to edit in "broken UTF-8 mode"? Then when you edited a page, you could insert some markup to put in non-ASCII characters. I don't know what the best way to do this would be; I am guessing something like \xAB\xCD where \x means "an 8 bit value in hexadecimal representation follows". If you have any other ideas, let me know.
Jonathan
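To make the \xAB\xCD idea concrete, here is a minimal sketch of what such markup could look like. It assumes the escapes carry the UTF-8 bytes of each non-ASCII character; the function names `to_escaped` and `from_escaped` are hypothetical and not part of any existing Wikipedia code.

```python
import re


def to_escaped(text: str) -> str:
    """Replace every non-ASCII character with \\xNN escapes of its UTF-8 bytes."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend(f"\\x{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


def from_escaped(markup: str) -> str:
    """Turn \\xNN escapes back into characters.

    Assumes the markup is pure ASCII apart from the escapes, so every
    code point after substitution is below 256 and maps 1:1 to a byte.
    """
    raw = re.sub(
        r"\\x([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        markup,
    )
    return raw.encode("latin-1").decode("utf-8")
```

Under this assumption a single accented Latin letter such as "é" turns into eight characters of markup (two escapes of four characters each), and a literal backslash-x sequence in the article text would be ambiguous.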
This would not work in French. We have accented letters in an awful number of words. That would make editing very difficult.
When is Jimbo coming back from holidays?
On Sunday 12 January 2003 03:23, Anthere wrote:
This would not work in French. We have accented letters in an awful number of words. That would make editing very difficult.
If the character is between 128 and 255 inclusive, present it as a single byte. If it's Greek, give the HTML character name. Else turn it into a number.
We could have a preference for what encoding to use on edit screens. Any character not in that encoding is represented as a number, unless it's Greek or for some reason has a character name.
phma
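The scheme proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the user's preferred encoding is passed as a Python codec name, the function name `to_edit_text` is hypothetical, and the set of "character names" is taken to be the HTML 4 entity table that ships with Python's standard library.

```python
from html.entities import codepoint2name


def to_edit_text(text: str, encoding: str = "latin-1") -> str:
    """Prepare article text for an edit box limited to `encoding`.

    Characters the encoding can represent are passed through unchanged.
    Others become a named HTML entity when one exists (this covers the
    Greek letters), and a decimal numeric reference otherwise.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            name = codepoint2name.get(ord(ch))
            if name is not None:
                out.append(f"&{name};")
            else:
                out.append(f"&#{ord(ch)};")
    return "".join(out)
```

With the default Latin-1 preference, French accented letters stay as they are, a Greek alpha becomes `&alpha;`, and the Polish "ł" becomes `&#322;`.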
On Sun, 12 Jan 2003 03:32:20 -0500, Pierre Abbat wrote:
If the character is between 128 and 255 inclusive, present it as a single byte. If it's Greek, give the HTML character name. Else turn it into a number.
Actually, if it's between 128 and 159, reject it outright. Byte values in that range have no meaning on the web at all. Unfortunately, they do have meaning in the default "Windows" character set, so a certain word processor from a very large software corporation with a poor reputation litters its documents with #146, #147, etc. in the guise of "smart quotes", and these fail to render on some good browsers. Perhaps the input processor could clean the text, replacing these characters with their Unicode equivalents via a lookup table?
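A cleanup pass of that kind might look like the sketch below. It assumes the submitted text was decoded as Latin-1, so the stray bytes show up as code points 128 to 159, and it uses Python's built-in `cp1252` codec as the lookup table. The function name `clean_windows_chars` is hypothetical, and dropping the five byte values that Windows-1252 leaves undefined is a choice made here for illustration, not something the thread settled.

```python
def clean_windows_chars(text: str) -> str:
    """Map code points 128-159 to the characters Windows-1252 intended."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x80 <= cp <= 0x9F:
            try:
                # The cp1252 codec acts as the lookup table.
                out.append(bytes([cp]).decode("cp1252"))
            except UnicodeDecodeError:
                # 0x81, 0x8D, 0x8F, 0x90 and 0x9D are undefined there: drop.
                continue
        else:
            out.append(ch)
    return "".join(out)
```

For example, #146 (0x92) becomes the right single quotation mark U+2019 and #147 (0x93) becomes the left double quotation mark U+201C.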
On Sun, Jan 12, 2003 at 12:23:06AM -0800, Anthere wrote:
This would not work in French. We have accented letters in an awful number of words. That would make editing very difficult.
What if we had a TeX mode for diacritics? TeX makes it fairly easy to put diacritic marks over and under letters. This isn't meant to be the default editing mode, just a mode for people with broken browsers.
Jonathan
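As a rough illustration of what a TeX mode for diacritics could involve, the sketch below converts a small subset of TeX accent commands into precomposed Unicode letters. The supported commands, the regular expression, and the name `tex_to_unicode` are all assumptions made for this example; real TeX has many more accent commands and quoting rules.

```python
import re
import unicodedata

# TeX accent command -> Unicode combining mark (a deliberately small subset).
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    "c": "\u0327",  # cedilla, written \c{c}
}

# Matches \'e or \'{e} for the symbol accents, and \c{c} for the cedilla.
_ACCENT = re.compile(r"""\\(['`^"~])\{?([A-Za-z])\}?|\\(c)\{([A-Za-z])\}""")


def tex_to_unicode(text: str) -> str:
    """Replace supported TeX accent commands with composed characters."""
    def repl(m: re.Match) -> str:
        accent = m.group(1) or m.group(3)
        letter = m.group(2) or m.group(4)
        # NFC folds letter + combining mark into one code point when possible.
        return unicodedata.normalize("NFC", letter + COMBINING[accent])
    return _ACCENT.sub(repl, text)
```

With this subset, typing `\'el\`eve fran\c{c}ais` in the edit box would be stored as "élève français".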
On Sun, Jan 12, 2003 at 12:23:06AM -0800, Anthere wrote:
This would not work in French. We have accented letters in an awful number of words. That would make editing very difficult.
French accented letters are part of Latin-1, so they could be edited directly. Only less common alphabets would suffer. In an article about Lech Walesa you could not directly input the stroked "l" of his last name, but would see something like &#322; instead.
See for example http://www.wikipedia.org/w/wiki.phtml?title=List_of_Polish_prime_ministers http://www.wikipedia.org/w/wiki.phtml?title=China for articles on the English Wikipedia that have those kinds of letters.
A UTF-8 capable browser would present all letters directly in the edit window. A non-UTF-8 browser would be shown the accented characters it cannot handle in numeric form, as is already done in parts of the English Wikipedia.
Making UTF-8 edits an option would make life easier for people working on Asian, Eastern European, Hebrew ... topics while still allowing edits by older browsers.
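The round trip described here could be sketched as follows, assuming a per-user flag says whether the browser handles UTF-8. The names `edit_view` and `from_edit` are hypothetical; the sketch relies only on Python's standard `xmlcharrefreplace` error handler and `html.unescape`.

```python
import html


def edit_view(text: str, utf8_capable: bool) -> str:
    """Text to place in the edit window for a given browser capability."""
    if utf8_capable:
        return text
    # Latin-1 letters stay as they are; everything else becomes &#NNN;.
    return text.encode("latin-1", "xmlcharrefreplace").decode("latin-1")


def from_edit(submitted: str) -> str:
    """Convert character references in a submitted edit back to characters."""
    return html.unescape(submitted)
```

One caveat of this simple version: `html.unescape` also converts references an author typed on purpose, such as `&amp;`, so a real implementation would need a rule for preserving those.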
Best regards,
JeLuF