Jim Hu wrote:
> I'm thinking this may be a php bug rather than a mw problem - but I'm wondering how to get around it. I generate MW xml for importing pages and I use htmlentities to encode things for xml. But I just saw a problem with the XML parser failing to recognize the &plusmn; entity.
&plusmn; has no inherent meaning in XML; it would have to be declared in a DTD, either the external one named by the doctype or the internal subset inside the document itself.
Instead of htmlentities(), use htmlspecialchars(), which is safe for XML because it emits only XML-predefined entity references: &amp;, &lt;, &gt;, and &quot;.
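A rough sketch of the difference, in case it helps. The sample string and the parses_as_xml() helper are made up for illustration, and the check assumes the SimpleXML extension is available (PHP 7 or later for the type hints):

```php
<?php
// Hypothetical sample text; "\xC2\xB1" is the UTF-8 byte sequence for the plus-minus sign.
$text = "5 \xC2\xB1 2 < 10 & \"ok\"";

// htmlentities() produces HTML-only named entities such as &plusmn;.
$html = htmlentities($text, ENT_COMPAT, 'UTF-8');
// 5 &plusmn; 2 &lt; 10 &amp; &quot;ok&quot;

// htmlspecialchars() escapes only &, <, > and ", leaving the plus-minus sign as a literal character.
$xml = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');
// 5 ± 2 &lt; 10 &amp; &quot;ok&quot;

// Hypothetical helper: wraps a fragment in a dummy element and reports
// whether it parses as well-formed XML.
function parses_as_xml(string $fragment): bool
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $ok = simplexml_load_string("<text>$fragment</text>") !== false;
    libxml_clear_errors();
    return $ok;
}

var_dump(parses_as_xml($html)); // bool(false): &plusmn; is not declared
var_dump(parses_as_xml($xml));  // bool(true)
```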
Ensure your text is properly encoded (e.g., UTF-8, unless your XML file is marked otherwise).
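One possible way to check the encoding before escaping, as a sketch only. The to_utf8() helper is hypothetical, it needs the mbstring extension, and it assumes that anything which is not already valid UTF-8 is ISO-8859-1, which may not hold for your data:

```php
<?php
// Hypothetical helper: returns $text as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1 (Latin-1).
function to_utf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "\xB1" is the plus-minus sign in ISO-8859-1; on its own it is not valid UTF-8.
$latin1 = "5 \xB1 2";
$utf8 = to_utf8($latin1); // "5 \xC2\xB1 2"

// Escape only after the encoding is known to be UTF-8.
$escaped = htmlspecialchars($utf8, ENT_COMPAT, 'UTF-8');
```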
-- brion