Hi y'all
Background
I have an old wiki in ISO-8859-1 format and tried to upgrade it to 1.5.4. I ran upgrade1_5.php, which states that it should convert to UTF-8. The problem is that it didn't. Upon closer inspection of the script it was clear that the global variable $wgUseLatin1 must have a non-false value for it to convert anything. In an attempt to minimize side effects I set $wgUseLatin1=true in the subroutine that does the conversion. This gave me converted page names, but the page contents were still ISO. Encouraged by my success, I simply applied the conversion subroutine to the data that comes from the cur.text field. This seems to have worked; as far as I can tell my data is now properly UTF-8 formatted.
Question
1. MediaWiki is supposed to convert old ISO format article text to UTF-8 on the fly, right? How is this triggered? Do I need to set $wgUseLatin1 to get it to work, or what?
2. I now have a converted wiki, converted as described above. Is there a downside to my approach? Does some text not get converted?
------------------------------------------------- Anders Nygård Operations Specialist Gl. Køge Landevej 55 2500 Valby Denmark Phone 45-7730 12 00 Direct 45-7730 12 74 Mobile 45-4144 38 77 www.uni2.dk
anders.nygard wrote:
- Mediawiki is supposed to convert old ISO format article text to UTF-8 on the fly, right. How is this triggered, Do I need to set $wgUseLatin1 to get it to work or what.
$wgLegacyEncoding.
-- brion vibber (brion @ pobox.com)
Brion, with this I don't need to upgrade to UTF-8? Can I just install on a blank site and only change the database?
When a text blob is loaded from the 'text' table which has an old_flags value not including the 'utf-8' tag, the text will be transcoded from the setting in $wgLegacyEncoding to UTF-8 during loading.
This affects only text coming out of 'text' (or 'archive' for deleted revisions). It does not apply to usernames, page titles, comments, and other database fields which are used directly.
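The load-time rule described above can be illustrated with a minimal sketch. This is not MediaWiki's actual code (which is PHP); the function name `load_text_blob` is hypothetical, and it assumes old_flags is a comma-separated list of tags. Other flags that may appear there (compression, for example) are ignored here.

```python
from typing import Optional


def load_text_blob(raw: bytes, old_flags: str,
                   legacy_encoding: Optional[str]) -> bytes:
    """Return the blob as UTF-8 bytes, transcoding legacy rows on load.

    Hypothetical helper: mirrors the rule described in the thread,
    not the real MediaWiki implementation.
    """
    # Assumption: old_flags is a comma-separated list of tags.
    flags = {flag.strip() for flag in old_flags.split(",") if flag.strip()}

    # Rows tagged 'utf-8' are already converted; with no legacy
    # encoding configured there is nothing to transcode from.
    if "utf-8" in flags or legacy_encoding is None:
        return raw

    # Untagged row: interpret the bytes in the legacy encoding
    # and re-encode them as UTF-8.
    return raw.decode(legacy_encoding).encode("utf-8")


if __name__ == "__main__":
    legacy_row = "Køge".encode("iso-8859-1")
    print(load_text_blob(legacy_row, "", "ISO-8859-1").decode("utf-8"))
```

Only the stored page text goes through a step like this; a title or username read straight from its own column never would, which is why those fields need a one-time conversion.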
To successfully upgrade an old database which contained ISO-8859-1 data, you must transcode all these other fields to UTF-8. You can either find your own way to do this (iconv, playing around with encoding support in MySQL 4.1+, etc), or you can very carefully try using the upgrade1_5.php script which we used to upgrade from 1.4 to 1.5 schema.
This script was made for internal use and is not documented. Please read through it to confirm that it does what you need and that you have configured things correctly for it before considering its use.
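For the "find your own way" route, the core of the job is a one-time ISO-8859-1 to UTF-8 transcode of each directly used field. The sketch below shows only that step on an in-memory row; the helper names are hypothetical, the field name is illustrative, and reading from and writing back to the database is left out. It is not what upgrade1_5.php does internally.

```python
from typing import Dict, Iterable, Optional


def latin1_to_utf8(value: bytes) -> bytes:
    """Reinterpret ISO-8859-1 bytes as text and re-encode as UTF-8."""
    return value.decode("iso-8859-1").encode("utf-8")


def transcode_row(row: Dict[str, Optional[object]],
                  fields: Iterable[str]) -> Dict[str, Optional[object]]:
    """Return a copy of row with the named byte fields transcoded.

    Hypothetical helper; fields not listed, and NULL (None) values,
    are left untouched.
    """
    converted = dict(row)
    for name in fields:
        value = converted.get(name)
        if isinstance(value, bytes):
            converted[name] = latin1_to_utf8(value)
    return converted


if __name__ == "__main__":
    # Illustrative row; 'page_title' stands in for any directly used field.
    row = {"page_id": 1, "page_title": "Køge".encode("iso-8859-1")}
    print(transcode_row(row, ["page_title"]))
```

Running a conversion like this twice over the same data double-encodes it, so it should be applied exactly once, on a backup copy first.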
-- brion vibber (brion @ pobox.com)
mediawiki-l@lists.wikimedia.org