[Wikitech-l] Use binary to represent varchar rather than UTF8

howard chen howachen at gmail.com
Wed Jan 10 02:47:15 UTC 2007


On 1/10/07, Ashar Voultoiz <hashar at altern.org> wrote:
> howard chen wrote:
> > In all table creation statements, such as
> >
> > CREATE TABLE `categorylinks` (
> >   `cl_from` int(8) unsigned NOT NULL default '0',
> >   `cl_to` varchar(255) binary NOT NULL default '',
> >   `cl_sortkey` varchar(255) binary NOT NULL default '',
> >   `cl_timestamp` timestamp(14) NOT NULL,
> >   UNIQUE KEY `cl_from` (`cl_from`,`cl_to`),
> >   KEY `cl_sortkey` (`cl_to`,`cl_sortkey`(128)),
> >   KEY `cl_timestamp` (`cl_to`,`cl_timestamp`)
> > ) TYPE=InnoDB;
> >
> >
> > Why use binary to represent varchar, rather than UTF8?
>
> Brion answered that earlier:
>
> http://lists.wikimedia.org/pipermail/mediawiki-l/2006-February/010267.html
>
> To quote him, using utf8 in the database will mostly:
>
> * Make indexes larger (3 bytes per character)
> * Cause failures if you use characters outside the BMP (Basic
> Multilingual Plane) in page titles, usernames, etc.
>
> cheers,
>
> --
> Ashar Voultoiz - WP++++
> http://en.wikipedia.org/wiki/User:Hashar
> http://www.livejournal.com/community/wikitech/
> IM: hashar at jabber.org  ICQ: 15325080
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l at lists.wikimedia.org
> http://lists.wikimedia.org/mailman/listinfo/wikitech-l
>

I would like to know whether Wikipedia is going to use this method in the future.
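
For reference, the two points quoted above can be checked with a little
arithmetic. This is only a rough sketch, not anything taken from MediaWiki:
the helper names index_bytes and outside_bmp are made up for illustration,
the binary columns are assumed to take one byte per character, and
per-column length prefixes are ignored.

```python
# Illustration only: helper names are hypothetical, not MediaWiki or MySQL APIs.

# MySQL's 3-byte utf8 charset reserves up to 3 bytes per character.
MYSQL_UTF8_MAX_BYTES = 3


def index_bytes(prefix_lengths, bytes_per_char):
    """Worst-case key size for indexed prefixes given in characters."""
    return sum(prefix_lengths) * bytes_per_char


def outside_bmp(text):
    """Return the characters whose code point lies above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# KEY `cl_sortkey` (`cl_to`,`cl_sortkey`(128)) from the quoted schema:
# cl_to is indexed in full (255 characters), cl_sortkey by a 128-character prefix.
binary_key = index_bytes([255, 128], 1)                   # assumed 1 byte/char
utf8_key = index_bytes([255, 128], MYSQL_UTF8_MAX_BYTES)  # 3 bytes/char

# U+10330 (GOTHIC LETTER AHSA) is outside the BMP and needs 4 bytes in UTF-8,
# which is more than a 3-byte-per-character column can hold.
title = "Gothic_\U00010330"
problem_chars = outside_bmp(title)

print("binary key bytes:", binary_key)
print("utf8 key bytes:  ", utf8_key)
print("characters outside the BMP:", [hex(ord(ch)) for ch in problem_chars])
```

With binary columns the application stores the UTF-8 bytes as-is, so
neither the tripled key size nor the BMP restriction applies, at the cost
of byte-wise rather than language-aware sorting.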



