Hi again,
I submitted a sample Wikipedia page to the W3C validator and was amazed to see that it carried no character encoding information whatsoever. Here is the full report from the validator:
I was not able to extract a character encoding labeling from any of the valid sources for such information. Without encoding information it is impossible to validate the document. The sources I tried are:
  * The HTTP Content-Type field.
  * The XML Declaration.
  * The HTML "META" element.
And I even tried to autodetect it using the algorithm defined in Appendix F of the XML 1.0 Recommendation. Since none of these sources yielded any usable information, I will not be able to validate this document. Sorry. Please make sure you specify the character encoding in use. IANA maintains the list of official names for character sets.
---- <end quote>
Surely Wikipedia ought to use an encoding such as UTF-8?
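
For anyone who wants to check a page by hand, here is a rough Python sketch that looks in the same three places the validator lists. The function name and the regular expressions are my own assumptions, not the validator's actual code, and it does not attempt the Appendix F autodetection.

```python
import re

def declared_encodings(content_type, body):
    """Report the charset label found in each of the three usual places.

    content_type: value of the HTTP Content-Type header, or None.
    body: the raw bytes of the document.
    (Hypothetical helper for illustration only.)
    """
    found = {"http": None, "xml": None, "meta": None}

    # 1. HTTP Content-Type field, e.g. "text/html; charset=utf-8"
    if content_type:
        m = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.I)
        if m:
            found["http"] = m.group(1).lower()

    # Only the start of the document is inspected. Decoding as Latin-1
    # never fails, and the labels themselves are plain ASCII.
    head = body[:4096].decode("latin-1")

    # 2. XML declaration at the very start of the document,
    #    e.g. <?xml version="1.0" encoding="UTF-8"?>
    m = re.match(r'<\?xml[^>]*?encoding\s*=\s*["\']([\w.:-]+)["\']', head, re.I)
    if m:
        found["xml"] = m.group(1).lower()

    # 3. HTML META element, e.g.
    #    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    m = re.search(r'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', head, re.I)
    if m:
        found["meta"] = m.group(1).lower()

    return found


# A page with no label in any of the three places, like the one reported above
print(declared_encodings("text/html", b"<html><head><title>x</title></head></html>"))
```

If all three entries come back as None, the page is in the same situation the validator complained about.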