Andre Oliveira da Costa wrote:
Brion Vibber wrote:
This is normal; set $wgLegacyEncoding for runtime conversion of old text entries.
... wow, that's it? ;-) I thought it would be harder =) Thank God it's simple...
;)
I'll try it ASAP and post the results here. However, a couple of questions:
- shouldn't upgrade1_5.php convert text entries as well?
No.
What's the point of "half converting" to UTF-8?
Ask that again when _you_ have sixteen million precompressed text records in Latin-1 and hundreds of people calling for your blood every minute the site is offline for the upgrade. ;)
- will $wgLegacyEncoding still be around in future releases? After all, the remaining non-UTF-8 chars will stay in the DB forever, since the upgrade script didn't convert them.
Yes.
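As a rough illustration of what runtime conversion amounts to, the Python sketch below decodes each stored entry at read time: rows flagged `utf-8` are taken as-is, and anything else is treated as being in the legacy encoding. The flag names, the raw-deflate assumption for compressed rows, and the helpers `read_text` and `raw_deflate` are illustrative assumptions, not MediaWiki's actual code.

```python
import zlib

# Stand-in for $wgLegacyEncoding; the value is an assumption for this sketch.
LEGACY_ENCODING = "latin-1"


def raw_deflate(data: bytes) -> bytes:
    """Compress to a headerless deflate stream (assumed storage format)."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def read_text(blob: bytes, flags: set) -> str:
    """Return a stored text entry as a string, converting at read time."""
    if "gzip" in flags:
        # Compressed rows are inflated first; the bytes inside keep
        # whatever encoding they were written in.
        blob = zlib.decompress(blob, wbits=-15)
    # Rows without the 'utf-8' flag predate the upgrade, so they are
    # decoded with the legacy encoding instead of being rewritten in the DB.
    encoding = "utf-8" if "utf-8" in flags else LEGACY_ENCODING
    return blob.decode(encoding)


if __name__ == "__main__":
    old_row = raw_deflate("São Paulo".encode("latin-1"))
    new_row = "São Paulo".encode("utf-8")
    print(read_text(old_row, {"gzip"}))
    print(read_text(new_row, {"utf-8"}))
```

Under these assumptions, old rows never need to be touched on disk, which is why the setting has to stay around for as long as unconverted rows exist.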
- would a "manual" conversion using iconv (as erchache2000 suggested in this thread), run after upgrade1_5.php and update.php have been applied successfully, convert the whole DB from Latin-1 to UTF-8?
Depends on your database...
Could this have any side effects that could compromise the DB? (E.g. if images or any other binary data are stored in the DB, I don't know whether iconv is smart enough not to mess with "Latin-1 chars" it might find within binary content.)
Yes, that would corrupt any compressed text entries. If you're using compressed old records you'd have to decompress the whole table before running such a conversion.
-- brion vibber (brion @ pobox.com)
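
The corruption risk for compressed entries can be reproduced in a few lines. The Python sketch below is only an illustration: it uses zlib's default container rather than whatever format the wiki actually stores, and the function names are made up for the example. It shows a blanket Latin-1 to UTF-8 pass breaking a compressed record, next to the decompress-convert-recompress order that avoids the problem.

```python
import zlib


def naive_latin1_to_utf8(blob: bytes) -> bytes:
    """Blanket conversion: treat every byte as a Latin-1 character."""
    return blob.decode("latin-1").encode("utf-8")


def safe_convert(blob: bytes) -> bytes:
    """Decompress first, convert the text, then compress again."""
    text = zlib.decompress(blob).decode("latin-1")
    return zlib.compress(text.encode("utf-8"))


def is_readable(blob: bytes) -> bool:
    """Report whether the blob still decompresses cleanly."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


original = "Ação e reação".encode("latin-1")
compressed = zlib.compress(original)

# Every byte >= 0x80 in the compressed stream becomes two bytes,
# so the result is no longer a valid compressed record.
mangled = naive_latin1_to_utf8(compressed)

if __name__ == "__main__":
    print("naive pass readable:", is_readable(mangled))
    print("safe pass readable: ", is_readable(safe_convert(compressed)))
```

The same reasoning applies to any binary column: a byte-level charset conversion cannot tell text from non-text, so it has to be restricted to data that is known to be plain Latin-1.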