I'm not sure I'm the right person to raise this question, but I wondered what the current thinking is on adapting the code for other character sets. If I recall correctly, we are now assuming UTF-8, right? What exactly does that mean, btw? That we changed the MySQL character tables for the characters above 7F? Anything else?
I'm asking this because I know that some writers on the German Wikipedia asked for an easy way to type accents. So I wondered if we should add a special pre-parse function that is language-dependent, i.e., defined in language??.php, and is called on edit text, title strings and search expressions before they are processed. The Germans could, for example, define it such that ö is always translated to "o (or they could in fact use '"o' as a notation, if they wanted). They could then unambiguously search for Goedel. I assume that the Polish would want something similar, and perhaps the people at Vikipedio would also like a special notation for their special letters, but I know the situation is a bit more complicated there.
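To make the idea concrete, here is a minimal sketch of such a pre-parse hook. It is written in Python rather than PHP purely for illustration, and every name in it (`PRE_PARSE_TABLES`, `pre_parse`) is hypothetical, not something that exists in the code. It assumes each language supplies a table mapping a special letter to its typed notation, and that the same function is applied to edit text, titles and search expressions alike.

```python
# Hypothetical per-language tables: special letter -> typed notation.
# Only a German table is sketched here; other languages would add their own.
PRE_PARSE_TABLES = {
    "de": str.maketrans({
        "ä": '"a', "ö": '"o', "ü": '"u',
        "Ä": '"A', "Ö": '"O', "Ü": '"U',
    }),
}


def pre_parse(text: str, lang: str) -> str:
    """Normalize special letters to the language's typed notation.

    Languages without a table get the text back unchanged.
    """
    table = PRE_PARSE_TABLES.get(lang)
    if table is None:
        return text
    return text.translate(table)


# Both spellings normalize to the same stored form, so a search matches either.
stored_title = pre_parse("Gödel", "de")
search_term = pre_parse('G"odel', "de")
print(stored_title == search_term)
```

Because the title and the search expression pass through the same function, it does not matter whether the writer typed the accented letter directly or used the notation. Whether the stored form should be the notation or the real character is a design choice this sketch leaves open.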
-- Jan Hidders