[Mediawiki-l] Sudden problem with some greek and cyrillic letters

Tim Starling tstarling at wikimedia.org
Sun May 6 13:35:04 UTC 2007

Ian Smith wrote:
>> * Due to the limitations of MySQL's Unicode support, by default we
>> continue to treat MySQL fields as binary and store pure UTF-8 Unicode
>> in them, although MySQL may have them listed as Latin-1 depending on
>> your server's defaults.
> Surely this is a bug?  If MW wants binary fields, then surely it should
> explicitly create them as binary, instead of leaving it up to some
> random server default?

In MySQL 4.0, there were no table or column character sets; there was only
a server character set. You could specify a "binary" modifier on columns,
altering the collation, which we duly did. Our 4.0-compatible schema thus
uses binary collations for varchar columns, but does not specify a
character set, since there was no way to do that in MySQL 4.0.
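
To illustrate what is at stake when UTF-8 sits in a column that MySQL
lists as Latin-1, here is a minimal Python sketch. It does not talk to
MySQL at all; it only shows the byte-level effect. The function names are
made up for this example, and the "misread" case assumes that some layer
(server, connection, or dump tool) decides to convert the stored bytes
from Latin-1 to UTF-8.

```python
# Illustration only: no database involved, just bytes and encodings.
# Function names are hypothetical and chosen for this example.

def stored_bytes(text: str) -> bytes:
    """The UTF-8 bytes that would be written into the column."""
    return text.encode("utf-8")

def read_as_binary(raw: bytes) -> str:
    """Column treated as binary: bytes come back untouched, decode as UTF-8."""
    return raw.decode("utf-8")

def misread_as_latin1(raw: bytes) -> str:
    """Bytes wrongly assumed to be Latin-1: each byte becomes one character."""
    return raw.decode("latin-1")

def double_encode(raw: bytes) -> bytes:
    """What a Latin-1 to UTF-8 conversion would produce from those bytes."""
    return misread_as_latin1(raw).encode("utf-8")

if __name__ == "__main__":
    sample = "αβγ Жук"  # Greek and Cyrillic sample text
    raw = stored_bytes(sample)
    print(read_as_binary(raw) == sample)   # binary handling is lossless
    print(misread_as_latin1(raw))          # mojibake
    print(len(raw), len(double_encode(raw)))
```

As long as nothing converts the bytes, the binary round trip is lossless.
Once a conversion is applied, each non-ASCII byte is expanded into two
bytes and Greek or Cyrillic text turns into mojibake.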

As of MediaWiki 1.9, there is an installer option to select a "MySQL 5
binary" schema, which does specify a binary character set.
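
As a rough sketch of the difference between the two schema styles, the
Python snippet below builds two CREATE TABLE statements. The table and
column names are hypothetical and are not taken from MediaWiki's schema
files; the only assumptions about MySQL are the BINARY column attribute
and the DEFAULT CHARSET=binary table option (the latter needs MySQL 4.1
or later).

```python
# Hypothetical table and column names, for illustration only.
COLUMNS = (
    "  ex_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,\n"
    "  ex_title VARCHAR(255) BINARY NOT NULL"
)

def create_table_sql(binary_charset: bool) -> str:
    """Return DDL with or without an explicit binary character set."""
    # 4.0-compatible style: BINARY on the column only, no charset named,
    # so the column's character set falls back to the server default.
    # "Binary" style: the table option names the character set explicitly.
    options = " DEFAULT CHARSET=binary" if binary_charset else ""
    return f"CREATE TABLE example_page (\n{COLUMNS}\n){options};"

if __name__ == "__main__":
    print(create_table_sql(binary_charset=False))
    print()
    print(create_table_sql(binary_charset=True))
```

In the first statement the character set is whatever the server defaults
to, which is how a column can end up listed as Latin-1. In the second it
is stated explicitly.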

-- Tim Starling

