Re: [Wikitech-l] Re: Surrogate pair hack

26 Oct 2005


Tim Starling wrote:
MySQL 5.0 does not reject the surrogate characters between U+D800 and
U+DFFF. This means we can store characters above the BMP either by setting
the character set to UTF-8 and inserting CESU-8, or by setting the character
set to UCS-2 and inserting UTF-16.
Sorry, I didn't realise that this subject has already been discussed on this
list:
But it sounds much more plausible when you say it. ;)
In all seriousness, if we do have to go that route, it looks like we
already have most of the plumbing in place. For Oracle and PostgreSQL
we're already adding special treatment for the binary data fields, which
with some tweaking could distinguish 'text - for conversion' from
'data - leave as is' at SQL generation time. If result sets come back
with the proper type information, then converting back should be easy
and transparent.
"In theory." :)
-- brion vibber (brion @ pobox.com)
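
For readers unfamiliar with the CESU-8 option mentioned in the quoted message, here is a minimal Python sketch of the conversion. It is an illustration only, not code from MediaWiki or MySQL; the function names to_cesu8 and from_cesu8 are made up for this example. The idea is that a character above the BMP is first split into its UTF-16 surrogate pair, and each surrogate is then written as an ordinary 3-byte sequence, so no single encoded character is longer than 3 bytes.

```python
def to_cesu8(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration).

    BMP characters are encoded exactly as in UTF-8. Each character above
    U+FFFF is split into a UTF-16 surrogate pair, and each surrogate is
    encoded as its own 3-byte sequence (6 bytes in total).
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # 'surrogatepass' lets lone surrogates already in the input through.
            out += ch.encode("utf-8", "surrogatepass")
        else:
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)    # leading (high) surrogate
            low = 0xDC00 | (cp & 0x3FF)   # trailing (low) surrogate
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
    return bytes(out)


def from_cesu8(data: bytes) -> str:
    """Decode CESU-8 back to text (hypothetical helper, for illustration)."""
    # Step 1: decode, allowing the 3-byte surrogate sequences through.
    with_surrogates = data.decode("utf-8", "surrogatepass")
    # Step 2: round-trip through UTF-16 so surrogate pairs are recombined.
    return with_surrogates.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


if __name__ == "__main__":
    clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, a character above the BMP
    print(clef.encode("utf-8").hex())  # standard UTF-8: one 4-byte sequence
    print(to_cesu8(clef).hex())        # CESU-8: two 3-byte sequences
```

The trade-off is that the stored bytes are not valid UTF-8 in the strict sense, so anything that reads the column has to know to convert back, which is where the 'text - for conversion' versus 'data - leave as is' distinction above comes in.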
