Hi Brion,
On Wed, 16 Nov 2005 13:46:40 -0800
Brion Vibber <brion(a)pobox.com> wrote:
Andre Oliveira da Costa wrote:
Brion Vibber wrote:
This is normal; set $wgLegacyEncoding for runtime conversion of old text
entries.
... wow, that's it? ;-) I thought it would be harder =) Thank God it's
simple... ;)
I'll try it ASAP and post the results here.
However, a couple of questions:
- shouldn't upgrade1_5.php convert text entries as well?
No.
What's the point in "half converting" to utf-8?
Ask that again when _you_ have sixteen million precompressed text
records in Latin-1 and hundreds of people calling for your blood every
minute the site is offline for the upgrade. ;)
Ok, got the message ;-) Yeah, seen from this perspective, it makes perfect
sense. And, after all, all that matters is that at the end of the day the
conversion (be it partial or full) works, and all content is accounted for
and properly accessible after the upgrade (confirmation still pending in my
case! ;-))
- will $wgLegacyEncoding be around in future releases? After all, the
remaining non-utf8 chars will be in the DB forever since the upgrade script
didn't catch them.
Yes.
Cool. I rest my case then ;-)
- would a "manual" conversion using iconv (as erchache2000 suggested on this
thread), run after upgrade1_5.php and update.php have been applied
successfully, convert the whole DB from latin-1 to utf-8?
Depends on your database...
Could this have any side-effects that could compromise the DB? (e.g. if
images or any other binary data are stored in the DB, I don't know if iconv
is smart enough not to mess with "latin-1 chars" it might find within binary
content)
Yes, that would corrupt any compressed text entries. If you're using
compressed old records you'd have to decompress the whole table before
running such a conversion.
Ok. In other words: "don't do it unless you're pretty sure you know what
you're doing" ;-) I didn't even know there could be compressed text records,
but they're as much of a problem in this case as any binary file. As for
me, I'll let it be as it is ;-)
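To make the risk concrete for anyone reading the archives, here is a small
sketch in Python of why a blanket latin-1 to utf-8 pass breaks compressed
records. It is only an illustration, not MediaWiki code: zlib stands in for
whatever compression the wiki actually uses, and the function names and the
sample text are made up for the example.

```python
import zlib

# Hypothetical stored record: Latin-1 text that was compressed before storage.
SAMPLE = "Instalação e configuração"
compressed = zlib.compress(SAMPLE.encode("latin-1"))


def blanket_latin1_to_utf8(raw: bytes) -> bytes:
    """Mimic a blind conversion pass: treat every byte as Latin-1 text
    and re-encode it as UTF-8, whether or not it really is text."""
    return raw.decode("latin-1").encode("utf-8")


def can_decompress(blob: bytes) -> bool:
    """Report whether zlib still accepts the blob."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


def convert_compressed_record(blob: bytes) -> bytes:
    """Safe order: decompress first, convert the text, then recompress."""
    text = zlib.decompress(blob).decode("latin-1")
    return zlib.compress(text.encode("utf-8"))


# The blind pass rewrites every byte >= 0x80 inside the compressed stream,
# so the result is no longer a valid compressed record.
mangled = blanket_latin1_to_utf8(compressed)
print(can_decompress(compressed), can_decompress(mangled))

# Decompressing before converting keeps the content intact.
fixed = convert_compressed_record(compressed)
print(zlib.decompress(fixed).decode("utf-8") == SAMPLE)
```

The same reasoning would apply to any binary data in the dump: the
conversion cannot tell text bytes from non-text bytes.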
Thanks for all the help. I will try $wgLegacyEncoding at home and will
report the results here.
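For the record, this is how I picture "runtime conversion": old rows stay
in the DB exactly as stored and are converted only when they are read. A
rough sketch of that idea in Python follows. It is my own illustration, not
MediaWiki's actual code; the per-record "utf-8" flag, the constant and the
function name are all assumptions made for the example.

```python
# Hypothetical stand-in for the $wgLegacyEncoding setting.
LEGACY_ENCODING = "latin-1"


def load_text(raw: bytes, flags: set) -> str:
    """Return stored text as a string, converting legacy rows on read.

    Assumption: records written after the upgrade carry a "utf-8" flag,
    while older records have no such flag and are still in the legacy
    encoding.
    """
    if "utf-8" in flags:
        return raw.decode("utf-8")
    return raw.decode(LEGACY_ENCODING)


# An old row (no flag) and a new row (flagged) yield the same text.
old_row = "ação".encode("latin-1")
new_row = "ação".encode("utf-8")
print(load_text(old_row, set()) == load_text(new_row, {"utf-8"}))
```

Seen this way, nothing in the DB has to be rewritten during the upgrade,
which fits with the downtime argument above.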
Best,
Andre
PS: BTW: IMHO some of this conversation could be summarized in the UPGRADE
docs; it could save others some worries.
--
Andre Oliveira da Costa
(costa(a)tecgraf.puc-rio.br)