On Friday, 4th August 2006 at 09:54:23 (GMT +0100), neil@nwjones.demon.co.uk wrote:
The processing involves inserting text directly into the middle of the wikimedia dumps
It matters which text editor you use to insert that text. Some editors can display UTF-8 encoded text correctly but cannot save it properly.
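If you would rather not depend on an editor at all, the insertion can be scripted. The following is only a minimal sketch, not something taken from the original workflow: the function name insert_text_utf8 and the idea of locating the insertion point with a marker string are my own assumptions for illustration. It works on raw bytes, so nothing outside the inserted text is re-encoded, and it refuses to touch a file that is not valid UTF-8 to begin with.

```python
def insert_text_utf8(path, marker, new_text):
    """Insert new_text right after the first occurrence of marker.

    Hypothetical helper for illustration. The file is handled as bytes,
    so existing content is copied through unchanged.
    """
    with open(path, "rb") as f:
        data = f.read()

    # Sanity check: raises UnicodeDecodeError if the file is not valid UTF-8.
    data.decode("utf-8")

    marker_bytes = marker.encode("utf-8")
    pos = data.find(marker_bytes)
    if pos == -1:
        raise ValueError("marker not found in file")

    cut = pos + len(marker_bytes)
    result = data[:cut] + new_text.encode("utf-8") + data[cut:]

    with open(path, "wb") as f:
        f.write(result)
```

Note that this reads the whole file into memory, which is fine for a forum backup but probably not for a full-size dump; for those you would want to stream the file in chunks instead.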
I regularly use the method you describe to manually edit and re-upload phpBB SQL backups (produced by phpBB's own built-in backup facility), and everything works fine, including all accented characters and Russian, Arabic or Chinese sentences. (And this even though phpBB's default distribution encoding is ISO-8859-1 and we had to convert all configuration and language files into UTF-8 manually.)
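For what it's worth, that kind of ISO-8859-1 to UTF-8 conversion can also be done with a few lines of script rather than by hand. This is just a sketch of one way to do it, not how we actually converted the phpBB files; the function name latin1_to_utf8 and the separate source and destination paths are assumptions for the example.

```python
def latin1_to_utf8(src_path, dst_path):
    """Re-encode a file from ISO-8859-1 to UTF-8 (hypothetical helper).

    newline="" disables newline translation, so the original line
    endings are written back unchanged.
    """
    with open(src_path, "r", encoding="iso-8859-1", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

One caveat: decoding as ISO-8859-1 never fails, so running this on a file that is already UTF-8 will silently double-encode it. Check the source encoding before converting.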