MySQL 5.0 does not reject the surrogate characters between U+D800 and U+DFFF. This means we can store characters above the BMP either by setting the character set to UTF-8 and inserting CESU-8, or by setting the character set to UCS-2 and inserting UTF-16.
It might be a hack, but at least it's a standard hack. After all, this is exactly what UTF-16 and CESU-8 were designed for. According to a certain online encyclopedia that we all know and love, the exact same problem is observed in Oracle, and CESU-8 is used to solve it.
Implementing it in MediaWiki would be a similar task to implementing PostgreSQL bytea support, which I was talking about in #mediawiki the other day. We could convert from UTF-8 to CESU-8 in Database::strencode(), and decode by altering the result rows returned by mysql_fetch_object() before they are returned by Database::fetchObject().
Binary data would have to be flagged both on write and read. On write, we could use a function other than strencode() where there is a need to construct raw SQL, and for the wrapper functions we could pass binary data in a blob object. When reading from the database, we could probably use mysql_field_type() or mysql_field_flags() to detect binary columns and skip the conversion accordingly.
It'd probably be easiest to do conversion offline, in the process of switching to MySQL 5.0. We wouldn't be converting the bulk text, just the metadata.
-- Tim Starling