I don't understand:
- I expected the two first tries to work
- and the last one to fail.
What happened is the exact opposite!
I am totally confused.
I don't even know how to ask my question properly.
I think that I understand what "é" and %E9 are...
but I do not understand what "Ã©" is...
moreover it is two characters "Ã" and "©" instead
of one...
Thank you for your help :) .
Best regards,
--
Lmhelp
The letter é has the codepoint 0xE9 in Unicode.
If the file is written in iso-8859-1, it is represented by just one
byte: 0xE9 (é)
If the file is written in utf-8, it is represented by two bytes:
0xC3 0xA9 (which show up as "Ã©" when they are read as iso-8859-1)
If the file is written in utf-16, it is represented by two bytes:
0x00 0xE9 in utf-16 BE and 0xE9 0x00 in utf-16 LE.
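
To check these byte values yourself, here is a minimal Python sketch (Python is my choice for illustration, it is not something from your setup). It encodes the single character é in each encoding, then decodes the utf-8 bytes as iso-8859-1 to reproduce the two characters "Ã" and "©":

```python
# The letter é, written by code point so this source file's own
# encoding cannot interfere with the demonstration.
ch = "\u00e9"

latin1 = ch.encode("iso-8859-1")   # one byte:  0xE9
utf8 = ch.encode("utf-8")          # two bytes: 0xC3 0xA9
utf16be = ch.encode("utf-16-be")   # two bytes: 0x00 0xE9
utf16le = ch.encode("utf-16-le")   # two bytes: 0xE9 0x00

# Reading the utf-8 bytes as if they were iso-8859-1 turns each byte
# into its own character: 0xC3 becomes "Ã" and 0xA9 becomes "©".
mojibake = utf8.decode("iso-8859-1")

for label, data in [("iso-8859-1", latin1), ("utf-8", utf8),
                    ("utf-16 BE", utf16be), ("utf-16 LE", utf16le)]:
    print(label, data.hex(" "))
print(mojibake)
```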
The line <?xml version="1.0" encoding="UTF-8"?> says "this file will be
in utf-8". If you then write "Etoilé " as 0x45 0x74 0x6f 0x69 0x6c 0xe9
0x20, that makes invalid XML, since it should have been 0x45 0x74 0x6f
0x69 0x6c 0xc3 0xa9 0x20 (alternatively, you could have specified a
different encoding in the prolog).
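
A small sketch of that mismatch, assuming Python's standard xml.etree.ElementTree parser (your own tool may report the error differently). The <name> element and the parses() helper are made up for the example:

```python
import xml.etree.ElementTree as ET

UTF8_PROLOG = b'<?xml version="1.0" encoding="UTF-8"?>'
LATIN1_PROLOG = b'<?xml version="1.0" encoding="ISO-8859-1"?>'

# Prolog says utf-8, but é is stored as the single iso-8859-1 byte 0xE9.
bad = UTF8_PROLOG + b"<name>Etoil\xe9 </name>"
# Prolog says utf-8 and é is stored as the two utf-8 bytes 0xC3 0xA9.
good = UTF8_PROLOG + b"<name>Etoil\xc3\xa9 </name>"
# Same single byte as in "bad", but the prolog now declares iso-8859-1.
relabeled = LATIN1_PROLOG + b"<name>Etoil\xe9 </name>"


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(parses(bad), parses(good), parses(relabeled))
```

Only the first document is rejected: the bytes have to match whatever encoding the prolog declares.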
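The use of %E9 is just a trick for urls (percent-encoding), since they
may not allow a literal "é" there. %E9 stands for the byte 0xE9, so in
this url the "é" is encoded in iso-8859-1; encoded in utf-8 it would
have been %C3%A9. It only appears in robots.txt because robots.txt
talks about urls.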
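
The percent-encoding can be reproduced with Python's standard urllib.parse module. This is only an illustration; which encoding your server actually expects in its urls is something I can't tell from here:

```python
from urllib.parse import quote, unquote

ch = "\u00e9"  # é

# Percent-encoding escapes the bytes of the character, so the result
# depends on which encoding produced those bytes.
as_latin1 = quote(ch, encoding="iso-8859-1")   # "%E9"
as_utf8 = quote(ch, encoding="utf-8")          # "%C3%A9"

# Decoding needs the same encoding that was used to encode.
back = unquote("%E9", encoding="iso-8859-1")

print(as_latin1, as_utf8, back)
```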