On Thu, Aug 22, 2002 at 10:11:41AM -0700, lcrocker(a)nupedia.com wrote:
> The English Wikipedia, and the German one being tested now, are
> both ISO-8859-1, not UTF-8. UTF-8 will be needed for Polish and
> other languages. There won't be much software change involved;
> just telling MySQL to index the right way.
That may in fact involve defining our own new character set for MySQL that
defines the properties of the subset of UTF-8 that covers English, German
and Polish. Or is each Wikipedia going to get its own MySQL server? Anyway,
I'll start asking around whether something like that already exists
somewhere.
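
To make the encoding point concrete, here is a minimal Python sketch. It is
an illustration only, not part of the wiki software; the helper name
fits_latin1 and the sample words are assumptions chosen for the example. It
checks which words can be stored as ISO-8859-1 and how many bytes each needs
in UTF-8.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Sample words are illustrative; any text in these languages would do.
samples = {
    "English": "encyclopedia",
    "German": "Übersicht",
    "Polish": "Łódź",
}

for language, word in samples.items():
    utf8_bytes = len(word.encode("utf-8"))
    print(f"{language}: {word!r} latin1={fits_latin1(word)} "
          f"chars={len(word)} utf8_bytes={utf8_bytes}")
```

The German word fits in ISO-8859-1, the Polish one does not, and in UTF-8
the byte count differs from the character count. That is why sorting and
indexing have to be aware of the character set.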
> As for a special notation for accented characters, I'm not fond
> of the idea. Foreign users should have foreign keyboards. Others
> should still be able to enter accents by whatever means their OS
> and browser allow, and I'm not aware of any that don't have some
> feature for it.
All I know at the moment is that the request has been made by a member of
the German community. I don't know how many people asked for it, why they
wanted it or how badly they need it, but I'll ask them. I'm a bit surprised
that Magnus hasn't brought this up (I'm not German), but I have the
impression he has been busy lately.
> I don't like duplicating effort that should be already done elsewhere.
The question is not whether you would implement it, but only whether it
would be OK to define some hooks so that they can implement it themselves,
if they want to, without changing any common code.
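
As a rough sketch of what such hooks could look like (in Python for
brevity, and not the actual wiki code): the common code keeps a registry of
per-language input filters and runs them on submitted text. Every name here
(register_input_hook, apply_input_hooks, german_accents) and the TeX-like
\"a notation are assumptions invented for this example; the notation the
German community actually asked for is not known to me yet.

```python
from typing import Callable, Dict, List

TextFilter = Callable[[str], str]

# Hypothetical registry: language code -> filters supplied by that community.
_input_hooks: Dict[str, List[TextFilter]] = {}


def register_input_hook(lang: str, hook: TextFilter) -> None:
    """Called by language-specific code; the common code is never edited."""
    _input_hooks.setdefault(lang, []).append(hook)


def apply_input_hooks(lang: str, text: str) -> str:
    """Called by the common code on submitted text before it is stored."""
    for hook in _input_hooks.get(lang, []):
        text = hook(text)
    return text


# Invented notation, for illustration only: \"a stands for ä, and so on.
GERMAN_NOTATION = {
    '\\"a': "ä", '\\"o': "ö", '\\"u': "ü",
    '\\"A': "Ä", '\\"O': "Ö", '\\"U': "Ü",
}


def german_accents(text: str) -> str:
    """Replace each notation sequence with its accented character."""
    for sequence, char in GERMAN_NOTATION.items():
        text = text.replace(sequence, char)
    return text


register_input_hook("de", german_accents)

print(apply_input_hooks("de", 'M\\"unchen'))  # German filter is applied
print(apply_input_hooks("en", 'M\\"unchen'))  # no hooks registered for "en"
```

With this shape, a wiki that registers nothing behaves exactly as before,
so the hooks cost the other languages nothing.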
-- Jan Hidders