Alexander Prudnikov wrote:
Hello.
I have 2 questions about UTF-8 encoding in wiki.
1. How is UTF-8 text encoded and decoded into other encodings? Where in the sources can I find it? And what additional libraries or software (except php, apache etc.) should I have in order to encode/decode UTF-8?
2. How can I define (in the code) what encoding my Wiki uses? I mean, which variable contains information about the encoding?
Hello,
1/ It doesn't make sense to decode UTF-8 encoded text to ASCII, for example. UTF-8 offers many more characters, which you will not be able to translate correctly. The same goes for ISO-8859-1 (which doesn't have the œ character).
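To illustrate the point about lost characters, here is a minimal sketch. It assumes PHP 7 or later with the mbstring extension loaded (needed only for this demonstration), and roundTrip() is a hypothetical helper, not something taken from the MediaWiki sources. It converts UTF-8 text to a narrower encoding and back, then checks whether the text survived.

```php
<?php
// Demonstration only: assumes the mbstring extension is available.

/**
 * Convert UTF-8 text to $target and back to UTF-8,
 * so the result can be compared with the input.
 */
function roundTrip(string $utf8Text, string $target): string
{
    $converted = mb_convert_encoding($utf8Text, $target, 'UTF-8');
    return mb_convert_encoding($converted, 'UTF-8', $target);
}

$cafe  = "caf\u{00E9}";  // "café": é exists in ISO-8859-1
$coeur = "c\u{0153}ur";  // "cœur": œ does not exist in ISO-8859-1

// True when the text comes back unchanged, false when a character was lost.
var_dump(roundTrip($cafe, 'ISO-8859-1') === $cafe);
var_dump(roundTrip($coeur, 'ISO-8859-1') === $coeur);
var_dump(roundTrip($cafe, 'ASCII') === $cafe);
```

Any character missing from the target encoding is replaced by a substitute during the first conversion, so the original text cannot be recovered afterwards.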
I don't think you need any specific library for PHP / Apache. The only thing needed is to output an HTTP header saying which encoding is used, so that the browser decodes the text correctly.
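As a rough sketch of what sending such a header can look like in plain PHP: contentTypeHeader() below is a hypothetical helper written for this example, and this is not necessarily how MediaWiki itself does it. Only the built-in header() function is assumed.

```php
<?php
/**
 * Build a Content-Type header line that declares the page encoding.
 * Hypothetical helper for illustration only.
 */
function contentTypeHeader(string $charset): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() must be called before any page output is sent.
header(contentTypeHeader('utf-8'));
```

The browser reads the charset parameter and uses it to decode the bytes of the page.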
2/ The default MediaWiki encoding is set to ISO-8859-1 through the $wgInputEncoding and $wgOutputEncoding variables in ./includes/DefaultSettings.php
When you configure the language to be used in LocalSettings.php ( example: $wgLanguageCode = "fr"; ), the software will include a language-specific script from ./languages. The Fr one in turn loads LanguageUtf8.php, which sets the encoding options.
So basically: set $wgInputEncoding and $wgOutputEncoding in your Language file and it should work :o) I recommend using utf-8.
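A hedged sketch of what those two assignments might look like inside a language file. The variable names come from the message above. The exact value string ("utf-8") and the placement in the file are assumptions, so check them against your own copy of LanguageUtf8.php.

```php
<?php
// Fragment of a language file under ./languages (illustrative only).
// Assumption: the encoding is given as a lowercase charset name.
$wgInputEncoding  = "utf-8";
$wgOutputEncoding = "utf-8";
```

Keeping both variables on the same value means text is accepted and served in a single encoding.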