[Wikipedia-l] Switching everything to UTF-8

Daniel Mayer maveric149 at yahoo.com
Tue Nov 18 20:39:32 UTC 2003


Peter wrote:
>Not at all! It clearly shows what happens when a page
>_IS_ 8859-1 encoded but editors want to use fancy
>characters. Same happens when they do it with an old
>browser on UTF-8 pages. So, you get trash either way,
>other editors revert the same way, so you may well
>use utf-8, don't you? :-)

OIC. Well, a person should /not/ be using fancy things
like curly quotes and long hyphens, because many
browsers (especially on non-MS systems) display them
as question marks. These should be fixed, not allowed
to propagate. The fact that some browsers break these
codes should be a good hint that they should not be
used to begin with (esp. since ASCII quotes and
regular hyphens can be used instead).

But a way around the larger issue is to sniff whether
or not a browser is UTF-8-aware and then serve the page
in either UTF-8 or Latin-1 (ISO 8859-1) based
on that. When the UTF-8 page is displayed it shows the
actual non-Latin characters; when the Latin-1 page is
displayed it shows the codes that represent those
characters.
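
A rough sketch of that idea in Python, just to make it concrete.
This is not how the wiki software does it. The function names are
made up for illustration, and using the Accept-Charset request
header as the "sniff" is an assumption on my part (a real check
might need to look at the browser version instead). Characters
that do not fit in Latin-1 are written out as numeric character
references such as &#26085;.

```python
def client_accepts_utf8(accept_charset):
    """Hypothetical sniffing rule: does the Accept-Charset header list utf-8?

    Simplification: q-values are ignored, and a missing or empty
    header is treated as "not UTF-8-aware".
    """
    names = [item.split(";")[0].strip().lower()
             for item in accept_charset.split(",")]
    return "utf-8" in names or "*" in names


def encode_page(text, accept_charset):
    """Return (body_bytes, charset_name) for one response."""
    if client_accepts_utf8(accept_charset):
        return text.encode("utf-8"), "utf-8"
    # Latin-1 fallback: anything outside ISO 8859-1 becomes a numeric
    # character reference (&#NNNN;) instead of being lost.
    body = text.encode("iso-8859-1", errors="xmlcharrefreplace")
    return body, "iso-8859-1"
```

The charset name returned alongside the bytes is what would go
into the Content-Type header, so the browser decodes the page the
same way it was encoded.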

That at least will prevent pages from getting damaged,
but the special characters will still show up as
question marks for people with older browsers, so
things like curly quotes and long hyphens should be
automatically converted to their ASCII counterparts.
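
For the automatic conversion, something along these lines would
do. Again only a sketch: the table and the function name are
hypothetical, the list of characters is not complete, and turning
the long (em) dash into two hyphens rather than one is my own
choice.

```python
# Map "fancy" punctuation to plain ASCII. Keys are written as
# escapes so this file itself stays pure ASCII.
ASCII_FALLBACKS = str.maketrans({
    "\u2018": "'",   # left single curly quote
    "\u2019": "'",   # right single curly quote / apostrophe
    "\u201c": '"',   # left double curly quote
    "\u201d": '"',   # right double curly quote
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash (assumed replacement)
})


def asciify_punctuation(text):
    """Replace curly quotes and long dashes with ASCII counterparts."""
    return text.translate(ASCII_FALLBACKS)
```

Run on save, this would leave all other non-ASCII characters
(accented letters, non-Latin scripts) untouched.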

-- Daniel Mayer (aka mav)
