Hi Brion,
On Wed, 16 Nov 2005 13:46:40 -0800 Brion Vibber <brion@pobox.com> wrote:
Andre Oliveira da Costa wrote:
Brion Vibber wrote:
This is normal; set $wgLegacyEncoding for runtime conversion of old text entries.
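For anyone unsure what "runtime conversion" means here: the stored rows are left untouched and converted only when they are read. A rough Python model of that idea follows. It is not MediaWiki's code; the `utf-8` flag name, the `load_text` helper and the row layout are assumptions made purely for illustration.

```python
# Stands in for $wgLegacyEncoding; the value is an assumption for a Latin-1 wiki.
LEGACY_ENCODING = "iso-8859-1"


def load_text(raw: bytes, flags: set) -> str:
    """Return a stored text blob as a Unicode string, converting lazily on read."""
    if "utf-8" in flags:
        # Row was written after the upgrade: already UTF-8.
        return raw.decode("utf-8")
    # Row predates the upgrade: convert from the legacy encoding on the fly.
    return raw.decode(LEGACY_ENCODING)


# Hypothetical rows: one written before the upgrade, one after.
old_row = ("café".encode("iso-8859-1"), set())
new_row = ("café".encode("utf-8"), {"utf-8"})

print(load_text(*old_row))
print(load_text(*new_row))
```

Both rows read back as the same string, even though the old one was never rewritten on disk.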
... wow, that's it? ;-) I thought it would be harder =) Thank God it's simple...
;)
I'll try it ASAP and post the results here. However, a couple of questions:
- shouldn't upgrade1_5.php convert text entries as well?
No.
What's the point in "half converting" to utf-8?
Ask that again when _you_ have sixteen million precompressed text records in Latin-1 and hundreds of people calling for your blood every minute the site is offline for the upgrade. ;)
OK, got the message ;-) Yeah, from that perspective it makes perfect sense. After all, what matters at the end of the day is that the conversion (partial or full) works, and that all content is accounted for and properly accessible after the upgrade (confirmation still pending in my case! ;-))
- will $wgLegacyEncoding be around in future releases? After all, any remaining non-UTF-8 text will stay in the DB forever, since the upgrade script didn't convert it.
Yes.
Cool. I rest my case then ;-)
- would a "manual" conversion using iconv (as erchache2000 suggested in this thread), run after upgrade1_5.php and update.php have been applied successfully, convert the whole DB from Latin-1 to UTF-8?
Depends on your database...
Could this have any side effects that could compromise the DB? (E.g., if images or other binary data are stored in the DB, I don't know if iconv is smart enough not to mess with "Latin-1 chars" it might find within binary content.)
Yes, that would corrupt any compressed text entries. If you're using compressed old records you'd have to decompress the whole table before running such a conversion.
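The corruption is easy to reproduce outside MediaWiki. The sketch below uses Python's zlib as a stand-in for whatever compression the wiki applied (an assumption), and mimics a blanket Latin-1 to UTF-8 conversion of the stored bytes, which is roughly what running iconv over a dump would do. The sample text and the `can_decompress` helper are made up for the example.

```python
import zlib

# A Latin-1 encoded text record, compressed before being stored.
original = "Olá, codificação"
compressed = zlib.compress(original.encode("iso-8859-1"))

# Blanket conversion: every byte is treated as a Latin-1 character and
# re-encoded as UTF-8, so each byte >= 0x80 becomes two bytes.
converted = compressed.decode("iso-8859-1").encode("utf-8")


def can_decompress(blob: bytes) -> bool:
    """Report whether zlib still accepts the blob."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


print(can_decompress(compressed))
print(can_decompress(converted))

# The safe order: decompress first, then convert the text itself.
safe_utf8 = zlib.decompress(compressed).decode("iso-8859-1").encode("utf-8")
print(safe_utf8.decode("utf-8") == original)
```

The converted blob is no longer a valid compressed stream, which is why the table would have to be decompressed before any such conversion.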
OK. In other words: "don't do it unless you're pretty sure you know what you're doing" ;-) I didn't even know there could be compressed text records, but in this case they're as much of a problem as any binary file. As for me, I'll leave it as it is ;-)
Thanks for all the help. I'll try $wgLegacyEncoding at home and report the results here.
Best,
Andre
PS: BTW, IMHO some of this conversation could be summarized in the UPGRADE docs; it could save others some worries.