On 1/10/07, howard chen <howachen(a)gmail.com> wrote:
> besides fully UTF8 support, using binary will have the advantages of
> saving space for indexes, so will it also alter the decision?

It is possible that when MySQL gets Unicode support beyond UCS-2 it
will come in the form of UTF-8, which is what we currently use packed
into those binary fields.
This isn't all that unlikely. Supporting non-BMP characters without
dealing with variable length characters requires UTF-32 (UCS-4), and I
think that even MySQL users would balk at another needless 2x increase
in the size of their ASCII data. Since the effort required to work
with UTF-16 (i.e. the two byte variable length encoding) is similar to
UTF-8, it may make sense to go all the way to UTF-8 and get the space
savings for ASCII.
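
To put rough numbers on the size argument, here is a small Python
sketch (not MySQL code; the helper name encoded_sizes is made up for
illustration) that counts the bytes needed for the same text in each
encoding. The "-le" codec variants are used so that no byte order mark
is added to the count.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of text in three Unicode encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-le" variants skip the byte order mark, so only the
        # characters themselves are counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Plain ASCII: 1 byte per character in UTF-8, 2 in UTF-16, 4 in UTF-32.
print(encoded_sizes("hello"))

# U+1D11E (musical G clef) lies outside the BMP: UTF-16 needs a
# surrogate pair, so all three encodings use 4 bytes for it.
print(encoded_sizes("\U0001D11E"))
```

For ASCII, each step from UTF-8 to UTF-16 to UTF-32 doubles the
storage, while a non-BMP character costs four bytes in all three.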
It's also possible that by the time MySQL has support for non-BMP
characters, it may have support for functional indexes, allowing the
index to operate on a different datatype than the row... (although a
straightforward type conversion would kill one of the primary
advantages of using a real string type: collation which isn't total
nonsense)
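
As a rough illustration of the collation point, the Python sketch
below sorts a few words by their raw UTF-8 bytes, which is how a
binary field compares them, and then by a crude case- and
accent-insensitive key. The rough_collation_key helper is a
hypothetical stand-in and not how MySQL implements its collations.

```python
import unicodedata


def rough_collation_key(word):
    """Crude stand-in for a real collation: ignore case and accents."""
    decomposed = unicodedata.normalize("NFD", word)
    # Drop the combining marks left behind by decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


words = ["zebra", "Apple", "\u00e9clair", "apple"]

# Binary comparison: plain byte order of the UTF-8 encoding, so every
# uppercase ASCII letter sorts before every lowercase one and any
# accented letter sorts after "z".
binary_order = sorted(words, key=lambda w: w.encode("utf-8"))

# Collation-like comparison using the rough key above.
collated_order = sorted(words, key=rough_collation_key)

print(binary_order)
print(collated_order)
```

In the binary ordering the accented word lands after "zebra"; with
the rough key it sorts between "apple" and "zebra".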
(and please trim your replies)