On Apr 6, 2005, at 2:59pm, John Blumel wrote:
Has anyone had success with submitting data that contains non-English
characters via a bot? I'm currently working on some Perl scripts to
extract and upload data... but I haven't had much success with
uploading articles containing non-standard characters.
Well, out of sheer stubbornness, and after a couple of days in
character encoding hell, I finally figured out how to get this to work.
It wasn't exactly intuitive but, if I ensure that the data files
containing the GftP article are Unicode (UTF-8) encoded and then encode
them as Latin-1 (ISO-8859-1) in the bot script, the submissions to the
wiki, which is configured for UTF-8, go through without any data
corruption. This also works for the page titles, which had not been
getting escaped properly.
I'll leave it for someone else to explain why this works this way.
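One possible explanation, offered only as a guess: somewhere between the script and the wiki, a layer may be treating the outgoing string as Latin-1 and converting it to UTF-8 on its own. If so, handing it raw UTF-8 bytes would encode them twice, while handing it Latin-1 bytes would come out as correct UTF-8. The sketch below shows that byte-level effect in Python rather than Perl, purely for illustration. The function `assumed_client_layer` and the sample text are hypothetical stand-ins, not anything taken from the actual bot or from the wiki software.

```python
def assumed_client_layer(raw: bytes) -> bytes:
    """Hypothetical stand-in for a layer that assumes its input is
    Latin-1 and re-encodes it as UTF-8 before sending it to the wiki."""
    return raw.decode("latin-1").encode("utf-8")


text = "Müller"                      # sample title with a non-ASCII character
file_bytes = text.encode("utf-8")    # what a UTF-8 data file would contain

# Passing the UTF-8 bytes straight through encodes them a second time.
double_encoded = assumed_client_layer(file_bytes)

# Decoding the file as UTF-8 and re-encoding as Latin-1 first means the
# assumed layer produces correct UTF-8 on the way out.
latin1_bytes = file_bytes.decode("utf-8").encode("latin-1")
round_tripped = assumed_client_layer(latin1_bytes)

print(double_encoded.decode("utf-8"))   # garbled form of the title
print(round_tripped.decode("utf-8"))    # original title
```

If this guess is right, the workaround would only hold for characters that exist in Latin-1; anything outside that range would fail at the Latin-1 encoding step.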
John Blumel