[Mediawiki-l] Mysql, UTF-8: How is it supposed to work?
Dorthe Luebbert
luebbert at globalpark.de
Fri Feb 17 12:01:40 UTC 2006
Hi,
I wonder how UTF-8 support in MediaWiki works and which combinations of
database charset and output charset are valid.
As far as I understand, the default character set changed to UTF-8 in
version 1.5. I therefore suppose MediaWiki stores HTML entities in the
database by default (because MySQL 4.0 does not fully support UTF-8).
Right?
Yesterday we tried to move a MediaWiki 1.5.x installation to MySQL 4.1
(the server was upgraded and the wiki was unfortunately affected). We
found a character set mess within the latin1 database, which we cleaned
up by find/replace in the dump file. Now we have UTF-8 content in the
database, the character set for the tables is set to utf8, and UTF-8 is
used as the output charset. We also enabled the experimental MySQL 5 flag.
Some parts of the page work all right, some do not (e.g. page titles);
the changelog file lists this as a to-do.
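To illustrate the kind of mess I mean: one common cause is UTF-8 bytes
being read back as latin1. The Python sketch below only demonstrates that
effect and is not the cleanup we actually ran. The sample string is
hypothetical, and Python's "latin-1" codec is an assumption standing in
for MySQL's latin1, which differs slightly for a few byte values.

```python
# Hypothetical sample text; not taken from the actual wiki.
original = "M\u00fcller"  # "Mueller" written with u-umlaut

# The bytes a UTF-8 application would send to the database.
utf8_bytes = original.encode("utf-8")

# What you get when those bytes are interpreted as latin1:
# the two-byte umlaut turns into two separate characters.
garbled = utf8_bytes.decode("latin-1")

# Reversing the misinterpretation recovers the original text,
# provided no bytes were lost or altered in between.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(original), repr(garbled), repr(repaired))
```

If a round trip like this restores the text, the stored bytes were already
UTF-8 and only the declared charset was wrong.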
Now it's broken and I would like to know which combination is supposed
to work. Is this one a possible combination?
Database: MySQL 4.1
PHP: 5.1
Database-charset: Latin1, all content in the database is latin1
Output-charset: UTF-8
Thanks for any hint.
Regards
Dorthe