Hi,
I wonder how UTF-8 support in MediaWiki works and which combinations of database charset and output charset are valid.
As far as I understand, the default character set changed to UTF-8 in version 1.5. I therefore suppose that MediaWiki stores HTML entities in the database by default (because MySQL 4.0 does not fully support UTF-8). Is that right?
Yesterday we tried to move a MediaWiki 1.5.x installation to MySQL 4.1 (the server was upgraded and the wiki was unfortunately affected). We found a character set mess within the latin1 database, which we cleaned up by find/replace in the dump file. Now we have UTF-8 content in the database, the character set for the tables is set to UTF-8, and UTF-8 is used as the charset in the output. We also enabled the experimental MySQL 5 flag. Some parts of the pages work all right, but others (e.g. page titles) do not; the changelog file mentions this as a to-do item.
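I cannot say for sure that this is exactly what happened in our dump, but the typical pattern behind such a mess is UTF-8 bytes being stored in, or read back through, something declared as latin1. A small Python sketch of that pattern follows; the sample word is made up and is not taken from our wiki:

```python
# Hypothetical sample text; not from our actual database.
title = "Käse"

# The application writes UTF-8 bytes ...
utf8_bytes = title.encode("utf-8")

# ... but they are interpreted as latin1 on the way back,
# so every non-ASCII character turns into two characters.
garbled = utf8_bytes.decode("latin-1")

# If the mistake was applied consistently, it can be reversed
# by going back to the raw bytes and decoding them as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

If only some rows were affected, a blanket reversal like this would damage the rows that were already correct, which is why we ended up using find/replace on the dump.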
Now it's broken, and I would like to know which combination is supposed to work. Is this one a possible combination?
Database: MySQL 4.1
PHP: 5.1
Database charset: latin1 (all content in the database is latin1)
Output charset: UTF-8
Thanks for any hint.
Regards
Dorthe