[Mediawiki-l] bots and non-english characters?

John Blumel johnblumel at earthlink.net
Fri Apr 8 06:28:44 UTC 2005


On Apr 6, 2005, at 2:59pm, John Blumel wrote:

> Has anyone had success with submitting data that contains non-English 
> characters via a bot? I'm currently working on some Perl scripts to 
> extract and upload data... but I haven't had much success with 
> uploading articles with non-standard characters.

Well, out of sheer stubbornness, and after a couple of days in 
character encoding hell, I finally figured out how to get this to work. 
It wasn't exactly intuitive, but if I make sure that the data files 
containing the GftP article are Unicode (UTF-8) encoded and then encode 
their contents as Latin-1 (ISO-8859-1) in the bot script, the submissions 
to the wiki, which is configured for UTF-8, go through without any data 
corruption. This also works for the page titles, which had not been 
getting escaped properly.

I'll leave it for someone else to explain why this works this way.
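
One possible explanation, offered as a guess and not verified against the 
bot script: if the script reads the UTF-8 file without decoding it, every 
byte is treated as a single Latin-1 character. Sending that string as 
UTF-8 would then encode each character a second time, while encoding it 
as Latin-1 maps each character back to the byte it came from and so 
restores the original UTF-8. The sketch below shows this effect in Python 
rather than Perl, purely to illustrate the byte arithmetic; the sample 
text and all variable names are hypothetical.

```python
from urllib.parse import quote

# Hypothetical sample text with non-English characters.
text = "Zürich café"

# What a UTF-8 encoded data file holds on disk.
utf8_bytes = text.encode("utf-8")

# Assumption: the script reads the file without decoding it, so each
# byte is seen as one Latin-1 character (two characters for "ü").
misread = utf8_bytes.decode("latin-1")

# Sending the misread string as UTF-8 encodes it a second time.
double_encoded = misread.encode("utf-8")

# Encoding it as Latin-1 maps every character back to its original
# byte, which recovers the correct UTF-8 byte sequence.
restored = misread.encode("latin-1")

# Percent-escaping the restored bytes gives a correctly escaped title.
escaped_title = quote(restored)

print(double_encoded == utf8_bytes)  # False: corrupted by double encoding
print(restored == utf8_bytes)        # True: original UTF-8 recovered
print(escaped_title)
```

If this is what happens, the Latin-1 step is not really converting 
anything; it only stops the data from being UTF-8 encoded twice.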


John Blumel



