Tim Starling wrote:
MySQL 5.0 does not reject the surrogate characters between U+D800 and U+DFFF. This means we can store characters above the BMP either by setting the character set to UTF-8 and inserting CESU-8, or by setting the character set to UCS-2 and inserting UTF-16.
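For readers unfamiliar with CESU-8, here is a rough Python sketch of the encoding idea: each character above the BMP is split into its UTF-16 surrogate pair, and each surrogate is then written as an ordinary three-byte UTF-8 sequence. The function names are made up for illustration and are not part of MediaWiki or MySQL; this only shows the byte-level transformation, not how any particular server version handles it.

```python
def utf16_surrogates(cp):
    """Split a code point above U+FFFF into its (high, low) surrogate pair."""
    cp -= 0x10000
    return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)


def encode_cesu8(text):
    """Encode a str as CESU-8 (hypothetical helper, for illustration only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Supplementary character: emit each surrogate as 3-byte UTF-8.
            for unit in utf16_surrogates(cp):
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


def decode_cesu8(data):
    """Decode CESU-8 bytes back to a normal str."""
    # Step 1: read the bytes as UTF-8 while tolerating encoded surrogates.
    with_surrogates = data.decode("utf-8", "surrogatepass")
    # Step 2: round-trip through UTF-16 so each surrogate pair is recombined.
    return with_surrogates.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


gothic_hwair = "\U00010348"  # a character above the BMP
print(gothic_hwair.encode("utf-8").hex())   # standard UTF-8: 4 bytes
print(encode_cesu8(gothic_hwair).hex())     # CESU-8: 6 bytes
```

The CESU-8 form never contains a four-byte sequence, which is why a column limited to three-byte UTF-8 can hold it.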
Sorry, I didn't realise that this subject has already been discussed on this list:
But it sounds much more plausible when you say it. ;)
In all seriousness, if we do have to go that route, it looks like we already have most of the plumbing in place. For Oracle and PostgreSQL we're already adding special treatment for the binary data fields, which with some tweaking could distinguish 'text - for conversion' from 'data - leave as is' at SQL generation time. If result sets come with the proper type information, then converting back should be easy and transparent.
"In theory." :)
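To make the 'text - for conversion' versus 'data - leave as is' split a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `TEXT`/`DATA` tags and the `to_db`/`from_db` names are hypothetical, not existing MediaWiki code, and the conversion shown is the CESU-8 variant (the UCS-2/UTF-16 variant would follow the same shape).

```python
import re

# Hypothetical field kinds; a real schema layer would carry this per column.
TEXT, DATA = "text", "data"

# Matches any character above the BMP.
_SUPPLEMENTARY = re.compile("[\U00010000-\U0010FFFF]")


def _to_surrogate_pair(match):
    """Replace one supplementary character with its two surrogate code points."""
    cp = ord(match.group()) - 0x10000
    return chr(0xD800 + (cp >> 10)) + chr(0xDC00 + (cp & 0x3FF))


def to_db(kind, value):
    """Prepare a value for insertion (assumed hook at SQL generation)."""
    if kind == DATA:
        return value  # binary data: leave as is
    # Text: split supplementary characters, then encode as 3-byte sequences.
    split = _SUPPLEMENTARY.sub(_to_surrogate_pair, value)
    return split.encode("utf-8", "surrogatepass")


def from_db(kind, raw):
    """Convert a fetched value back, using the column's type information."""
    if kind == DATA:
        return raw
    with_surrogates = raw.decode("utf-8", "surrogatepass")
    # Round-trip through UTF-16 to recombine surrogate pairs.
    return with_surrogates.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


blob = b"\xf0\x90\x8d\x88"          # binary field that happens to look like UTF-8
title = "Gothic \U00010348"         # text field with a character above the BMP
assert to_db(DATA, blob) == blob    # untouched
assert from_db(TEXT, to_db(TEXT, title)) == title
```

The point of the tag is that a binary field must pass through untouched even when its bytes happen to look like UTF-8, while text fields are converted in both directions.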
-- brion vibber (brion @ pobox.com)