On 1/10/07, howard chen howachen@gmail.com wrote:
On 1/10/07, Ashar Voultoiz hashar@altern.org wrote:
howard chen wrote:
In all table creation statements, such as

CREATE TABLE `categorylinks` (
  `cl_from` int(8) unsigned NOT NULL default '0',
  `cl_to` varchar(255) binary NOT NULL default '',
  `cl_sortkey` varchar(255) binary NOT NULL default '',
  `cl_timestamp` timestamp(14) NOT NULL,
  UNIQUE KEY `cl_from` (`cl_from`,`cl_to`),
  KEY `cl_sortkey` (`cl_to`,`cl_sortkey`(128)),
  KEY `cl_timestamp` (`cl_to`,`cl_timestamp`)
) TYPE=InnoDB;

why are the varchar columns declared binary rather than UTF-8?
Brion answered that earlier:
http://lists.wikimedia.org/pipermail/mediawiki-l/2006-February/010267.html
To quote him, using utf8 in the database will mostly:
- Make indexes larger (3 bytes per character)
- Cause failures if you use characters outside the BMP (Basic
  Multilingual Plane) in page titles, usernames, etc.
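
As a rough illustration of those two points, here is a minimal Python sketch. It is not MediaWiki or MySQL code: the helper names (utf8_len, outside_bmp, index_bytes) are made up for this example, and the 3-bytes-per-character figure is simply the one quoted above. The prefix lengths 255 and 128 come from the `cl_sortkey` key in the table definition.

```python
# Hypothetical helpers for illustration only; none of this is MediaWiki code.

# Assumption taken from the thread: the utf8 charset reserves up to
# 3 bytes per character, so it cannot store 4-byte sequences.
MAX_BYTES_PER_CHAR_UTF8 = 3


def utf8_len(ch):
    """Number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def outside_bmp(text):
    """Return the characters in text whose code point is above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def index_bytes(prefix_lengths, bytes_per_unit):
    """Worst-case key width for an index over the given column prefixes."""
    return sum(prefix_lengths) * bytes_per_unit


if __name__ == "__main__":
    # KEY `cl_sortkey` (`cl_to`, `cl_sortkey`(128)) with varchar(255) cl_to
    prefixes = [255, 128]
    print("binary key width:", index_bytes(prefixes, 1))
    print("utf8 key width:  ", index_bytes(prefixes, MAX_BYTES_PER_CHAR_UTF8))

    # U+10330 (GOTHIC LETTER AHSA) lies outside the BMP
    for title in ["Wiki", "日本語", "\U00010330"]:
        sizes = [utf8_len(ch) for ch in title]
        blocked = [ch for ch in title if utf8_len(ch) > MAX_BYTES_PER_CHAR_UTF8]
        print(repr(title), sizes, "non-BMP:", len(outside_bmp(title)),
              "too wide for 3-byte utf8:", len(blocked))
```

Note that with binary columns the lengths are counted in bytes, so a varchar(255) binary field holds fewer than 255 characters when the text contains multi-byte sequences; the sketch only compares worst-case key widths.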
cheers,
-- Ashar Voultoiz - WP++++ http://en.wikipedia.org/wiki/User:Hashar http://www.livejournal.com/community/wikitech/ IM: hashar@jabber.org ICQ: 15325080
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l
I would like to know whether Wikipedia is going to use this method in the future.
I'm sure MediaWiki will make use of proper UTF-8 support when MySQL has proper UTF-8 support. Our users want to use Unicode characters beyond plane 0 (the Basic Multilingual Plane), and the current MySQL utf8 support does not make this possible.
Andrew Dunbar (hippietrail)