Well, it definitely seems that the preg_replace_callback issue is a PHP bug, so I've filed a report:
http://bugs.php.net/bug.php?id=40871
I still think there's a MediaWiki issue here too, since MediaWiki is generating the bad UTF-8 in the first place: I can't see the bad bytes in the database, after all, so they don't seem to be coming from the stored text. Emitting ill-formed UTF-8 is a violation of the Unicode standard.
So, comments? Should I report an MW bug?
Ian