-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Jared Williams wrote:
I patched UtfNormal.php to use the new intl extension's normalization function, http://php.net/manual/en/normalizer.normalize.php
But running UtfNormalTest.php causes 200+ errors, whilst the PHP normalization routines are error free.
So kind of at a standstill, wondering what the problem is. I guess WM uses utf8_normalize(), which is based on ICU like intl; does that have the same errors?
It may be that intl is using an ICU library which supports an older version of Unicode. UtfNormal is currently built using the Unicode 5.1 files, as I recall, so will have the 5.1 test cases.
For the most part, this means that the newer version includes mapping rules for newly-added characters -- normalization is deliberately designed to be compatible and existing rules should never change. (Though there might be a few exceptions changed as errata, I forget.)
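One way to check this theory is to see whether the failures are confined to test cases that use characters the library's Unicode data doesn't know about. Below is a rough sketch of that idea in Python, using its built-in unicodedata module rather than intl or ICU, so treat it as an illustration only. The function names are made up for the example, and the input line is assumed to follow the "source;NFC;NFD;NFKC;NFKD; # comment" layout of the Unicode NormalizationTest.txt file.

```python
import unicodedata


def parse_test_line(line):
    """Split one 'c1;c2;c3;c4;c5; # comment' line into five strings."""
    fields = line.split("#", 1)[0].strip().split(";")[:5]
    return ["".join(chr(int(cp, 16)) for cp in field.split()) for field in fields]


def uses_unassigned(text):
    """True if any character is unassigned (category Cn) in the local Unicode data."""
    return any(unicodedata.category(ch) == "Cn" for ch in text)


def check_line(line):
    """Return 'pass', 'fail', or 'skipped' for one test line.

    'skipped' means the line uses characters that the local Unicode data
    does not define, e.g. because the test file is newer than the library.
    """
    source, nfc, nfd, nfkc, nfkd = parse_test_line(line)
    if uses_unassigned(source + nfc + nfd + nfkc + nfkd):
        return "skipped"
    expected = {"NFC": nfc, "NFD": nfd, "NFKC": nfkc, "NFKD": nfkd}
    ok = all(unicodedata.normalize(form, source) == want
             for form, want in expected.items())
    return "pass" if ok else "fail"


if __name__ == "__main__":
    # Which Unicode version the local normalization data was built from.
    print("Unicode data version:", unicodedata.unidata_version)
    sample = "212B;00C5;0041 030A;00C5;0041 030A; # ANGSTROM SIGN"
    print(check_line(sample))
```

If nearly all of the failing cases come out as "skipped" in a check like this, a version mismatch is the likely culprit; if assigned characters are failing too, the problem is probably elsewhere.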
- -- brion