The data from a MediaWiki database stored on a MySQL v3.23.58 server was recently dumped (by someone else), and I imported it into a v4.0.21 server.
On the new instance of the wiki, running on MySQL v4.0.21, I noticed that the text on some pages wasn't displaying properly and that, on editing such a page, the 'bad' data was replaced with a number of question marks. Without saving the edit, I checked and found that the characters in the database were garbled and that the garbling was already present in the dump file, so it was probably introduced by the dump process. (I temporarily do not have access to the original site, so I can't verify the exact original data, but it was served correctly there just before that server was taken offline and the dump produced.)
Is this a known issue and is there a way to prevent/correct it? I know that MySQL v4.0.x is recommended for MediaWiki but the reasons given seem to be related to performance.
I saw a note in the list archive related to similar issues with MySQL v4.1.x,
http://mail.wikipedia.org/pipermail/mediawiki-l/2004-November/002245.html
but I'm not certain this is exactly the same issue, since the dump didn't actually turn the characters into question marks and I can't positively identify the characters from the original database that were corrupted. I think one of them was 0xe8 or 0xe9 (è or é, if those display correctly in this email; &egrave; or &eacute; in HTML encoding) but, being a hopeless English-speaking monoglot, I don't know for sure which would have been used.
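For what it's worth, here is a rough Python sketch of how the suspect bytes could be inspected. It assumes, without my having confirmed it, that the original text was stored as Latin-1 (ISO-8859-1), where 0xe8 is è and 0xe9 is é. The function names and the sample string are made up for illustration and have nothing to do with MediaWiki or MySQL themselves.

```python
def describe_bytes(raw):
    """List (offset, hex value, Latin-1 character) for each non-ASCII byte."""
    return [
        (offset, "0x%02x" % byte, bytes([byte]).decode("latin-1"))
        for offset, byte in enumerate(raw)
        if byte >= 0x80
    ]


def as_utf8_or_replacement(raw):
    """Decode as UTF-8, substituting U+FFFD for bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical sample: "café crème" encoded as Latin-1 (an assumption,
# not data taken from the actual dump file).
sample = b"caf\xe9 cr\xe8me"

for offset, hex_value, char in describe_bytes(sample):
    print(offset, hex_value, char)

# The same bytes read as UTF-8 are invalid, so each one becomes U+FFFD,
# which many programs then render as a question mark.
print(as_utf8_or_replacement(sample))
```

Running something like `describe_bytes` over the affected rows of the dump file (opened in binary mode) would at least show which byte values are present, even if it cannot say what they were before the dump.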
John Blumel