Hi,
some of the things that I've learned while working on MediaWiki I18N:
* I18N is not just translating messages. There's more:
* Different languages use different digits (numerals).
* In some languages, you have to take care of grammatical cases,
e.g. genitive, which is hard to notice if you don't speak those
languages.
* Some languages don't use italics. Don't hardcode <i> to emphasize
something.
* Some languages attach prefixes to words. English forms the plural
with an 's' suffix; other languages use prefixes for this, or they
combine the article and the noun. Links have to work differently
on those wikis.
* Words are not always separated by whitespace. Some languages
have one letter per word. There, links should not extend to the
characters that follow, as they do in English (e.g. [[house]]s).
* Unicode support is a well of never-ending joy:
* Writing direction switches don't work as you'd expect, esp. when
users on RTL wikis use LTR usernames. E.g. when you have a
number, a username and a timestamp and don't take care, the
timestamp might end up to the right or to the left of the
username, depending on the username (Remember: "Arabic" digits
(123..) are LTR).
* Some letters look the same, but have different codepoints, e.g.
A (Latin Capital Letter A, U+0041) and Α (Greek Capital Letter
Alpha, U+0391). This allows the creation of user names and pages
which resemble already existing pages/users. Nice for vandals.
* There are different ways to encode the same character.
Letters with diacritics (like the German umlauts) have their own
codepoint, but can also be composed from a base letter and
a combining diacritical mark. Lots of fun if you upload images
from a Mac and try to insert them in an article from a
Windows/Linux computer. Don't forget to normalize all user input.
* PHP's support for Unicode string manipulation is poor. Many of
the above-mentioned problems forced us to write custom extensions
for PHP.
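
To make the Unicode points above a bit more concrete, here is a small
sketch in Python (not PHP, and not MediaWiki's actual code). The
function names normalize_title, scripts_used and isolate are made up
for this example. It assumes NFC as the normalization form, guesses a
letter's script from the first word of its Unicode character name
(a rough heuristic, not a real script lookup), and wraps usernames in
the Unicode directional isolate characters U+2068/U+2069 so their
direction does not leak into neighbouring text.

```python
import unicodedata


def normalize_title(text):
    """Return the NFC form, so equivalent encodings compare equal."""
    return unicodedata.normalize("NFC", text)


def scripts_used(text):
    """Rough script guess: first word of each letter's Unicode name.

    This is only a heuristic for illustration; a real check would use
    proper Unicode script data.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }


def isolate(text):
    """Wrap text in FIRST STRONG ISOLATE ... POP DIRECTIONAL ISOLATE."""
    return "\u2068" + text + "\u2069"


# Same visible letter, two encodings: precomposed vs. base + combining mark.
precomposed = "\u00fc"   # LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"   # "u" followed by COMBINING DIAERESIS
assert precomposed != decomposed
assert normalize_title(precomposed) == normalize_title(decomposed)

# Look-alike letters from different scripts in one name.
suspicious = "P\u0391GE"  # the second letter is GREEK CAPITAL LETTER ALPHA
assert len(scripts_used(suspicious)) > 1

# Isolate a username before placing it between a number and a timestamp.
line = "42 " + isolate("username") + " 12:34"
```

Flagging every mixed-script name would be too strict for real wikis
(many legitimate names mix scripts), so treat scripts_used as a way to
spot candidates for review, not as a filter.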
MediaWiki started with internationalization in mind, but its authors
spoke only Western languages (English, German, French, Spanish,
Esperanto), and many of the problems mentioned above required
code changes. There are still open issues. E.g., in Piedmontese,
there's a conflict between certain language constructs requiring
multiple ticks (') and the wiki syntax for bold/italics.
So yes, MediaWiki is quite good when it comes to internationalization,
but it's still far from perfect.
Regards,
jens