Hi,
we are running MediaWiki v1.4.0 with an ISO-8859-1 DB, and when we tried to upgrade, the locale conversion from ISO-8859-1 to UTF-8 didn't work: all latin1 chars like çéõ etc. got replaced by ",". This would be "only" a PITA if we just had to convert the accented chars by hand, but it turned out that the conversion of URLs with latin1 chars sent some pages into a black hole -- they simply seem to disappear.
I filed a bug about this (bug #3898 http://bugzilla.wikimedia.org/show_bug.cgi?id=3898); please check the link for more details. But I was hoping someone on this ML would know how to work around this...
Have any of you guys had to do such a conversion? If so, did it work 100%? Until I can successfully convert the DB locale, we're stuck with 1.4.0 =(
Any help will be much appreciated.
TIA
Andre
Andre Oliveira da Costa wrote:
Hi,
we are running MediaWiki v1.4.0 with an ISO-8859-1 DB, and when we tried to upgrade, the locale conversion from ISO-8859-1 to UTF-8 didn't work: all latin1 chars like çéõ etc. got replaced by ",". This would be "only" a PITA if we just had to convert the accented chars by hand, but it turned out that the conversion of URLs with latin1 chars sent some pages into a black hole -- they simply seem to disappear.
I filed a bug about this (bug #3898 http://bugzilla.wikimedia.org/show_bug.cgi?id=3898); please check the link for more details. But I was hoping someone on this ML would know how to work around this...
Have any of you guys had to do such a conversion? If so, did it work 100%? Until I can successfully convert the DB locale, we're stuck with 1.4.0 =(
Any help will be much appreciated.
TIA
Andre
for what it's worth: you are not alone ...
I tried an online conversion of my database (http://www.kgv.nl) but was unsuccessful. I'm going to try it locally and see where that leads us.
As for the empty pages: they are still there. If you have access to phpMyAdmin or any other MySQL frontend and look into the content table, you will see the content there.
Ruud
Hi Ruud,
On Tue, 08 Nov 2005 12:44:36 +0100 ruud habets rhabets@kgv.nl wrote:
Andre Oliveira da Costa wrote:
Hi,
we are running MediaWiki v1.4.0 with an ISO-8859-1 DB, and when we tried to upgrade, the locale conversion from ISO-8859-1 to UTF-8 didn't work: all latin1 chars like çéõ etc. got replaced by ",". This would be "only" a PITA if we just had to convert the accented chars by hand, but it turned out that the conversion of URLs with latin1 chars sent some pages into a black hole -- they simply seem to disappear.
I filed a bug about this (bug #3898 http://bugzilla.wikimedia.org/show_bug.cgi?id=3898); please check the link for more details. But I was hoping someone on this ML would know how to work around this...
Have any of you guys had to do such a conversion? If so, did it work 100%? Until I can successfully convert the DB locale, we're stuck with 1.4.0 =(
Any help will be much appreciated.
TIA
Andre
for what it's worth: you are not alone ...
I tried an online conversion of my database (http://www.kgv.nl) but was unsuccessful. I'm going to try it locally and see where that leads us.
Sorry to hear that (OTOH I'm glad I am not an isolated case ;-)).
As for the empty pages: they are still there. If you have access to phpMyAdmin or any other MySQL frontend and look into the content table, you will see the content there.
Well, that's something -- at least the content is not entirely lost. But it would be a real PITA to recover "hidden" content this way.
To the developers: any chance the upgrade script can be fixed?
Ruud: if you make any progress, please post it here, I'll do the same.
TIA
Andre
I had similar problems. You can convert manually with iconv, and afterwards try (and pray) for a normal update.
On my system it fails.
Do this:
iconv -f iso-8859-1 -t utf-8 olddatabase.sql > new_utf8_database.sql
and afterwards try to update to 1.5.x directly from the web browser, without the upgrade1_5.php script. I tried to use it and it fails too.
Another question is images: I don't know if you need to convert image names too...
Tell me all your results, I'm in the same situation. All information is welcome...
Good Luck ;-D
On Tue, 08 Nov 2005 14:58:20 +0100 erchache2000 erchache2000.enciclopedia@gmail.com wrote:
I had similar problems. You can convert manually with iconv, and afterwards try (and pray) for a normal update.
On my system it fails.
Do this:
iconv -f iso-8859-1 -t utf-8 olddatabase.sql > new_utf8_database.sql
and afterwards try to update to 1.5.x directly from the web browser, without the upgrade1_5.php script. I tried to use it and it fails too.
Another question is images: I don't know if you need to convert image names too...
Tell me all your results, I'm in the same situation. All information is welcome...
Good Luck ;-D
Hi,
thks for the feedback. I did think about iconv, but I wasn't sure if it would break something else; judging by your message, it's really not that safe. But what exactly do you mean by "failed"? Did you still have missing page content after the conversion? Did it have any side-effects?
TIA
Andre
MediaWiki-l mailing list MediaWiki-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/mediawiki-l
On Tue, 08 Nov 2005, Andre Oliveira da Costa wrote:
Hi,
we are running MediaWiki v1.4.0 with an ISO-8859-1 DB, and when we tried to upgrade, the locale conversion from ISO-8859-1 to UTF-8 didn't work: all latin1 chars like çéõ etc. got replaced by ",". This would be "only" a PITA if we just had to convert the accented chars by hand, but it turned out that the conversion of URLs with latin1 chars sent some pages into a black hole -- they simply seem to disappear.
I had the same problem, when using the upgrade1_5.php and update.php scripts. After I put this into my LocalSettings.php it worked though:
$wgLegacyEncoding = 'ISO-8859-1';
christof
Hi Christof,
On Tue, 8 Nov 2005 14:49:11 +0100 Christof Damian christof@damian.net wrote:
On Tue, 08 Nov 2005, Andre Oliveira da Costa wrote:
Hi,
we are running MediaWiki v1.4.0 with an ISO-8859-1 DB, and when we tried to upgrade, the locale conversion from ISO-8859-1 to UTF-8 didn't work: all latin1 chars like çéõ etc. got replaced by ",". This would be "only" a PITA if we just had to convert the accented chars by hand, but it turned out that the conversion of URLs with latin1 chars sent some pages into a black hole -- they simply seem to disappear.
I had the same problem, when using the upgrade1_5.php and update.php scripts. After I put this into my LocalSettings.php it worked though:
$wgLegacyEncoding = 'ISO-8859-1';
Mmmh... this looks promising. But with this setting did you need to run upgrade1_5.php after all? Or did this simply tell mediawiki to "accept" iso-8859-1, and you left your DB untouched?
If I were sure this setting is going to be around forever, I would probably go this way, too. But I am afraid iso-8859-1 might not be supported at all in the near future, in which case we would just be putting off an inevitable upgrade to UTF-8... Do you know if this setting is official or is it an "undocumented feature"?
TIA
Andre
I think.....
Because it is OFFICIALLY UNDOCUMENTED, I don't have a specific URL where I can learn how to convert from ISO to UTF. (<jokemode> Yes, I'm a lamer, so what!... ;-P</jokemode>)
If you convert your database to UTF-8 you don't need to use the upgrade1_5.php script, so you can upgrade to 1.5.2 directly from the web browser. I didn't lose any information using iconv, but I didn't try it with images on the file system; the database is one thing, and the files in the /image dir are another, you know? ;-)
And yes, moving to 1.5.x is inevitable because it is OBLIGATORY. Direct show problems....
Try to use iconv and upgrade directly, like the brave.....send me all information, please. All information is welcome....
Good luck, again....
P.S.: At this point I'm thinking of setting up a "fucked by mediawiki software upgrades" forum....Programmers, please don't be furious with me, you are doing good work, but you do what all programmers do: you don't document your programs....When will open software be open software with open documentation? :-S....
Needless to say, make BACKUPS of all your database data and images before applying any change.
1- make a dump of your database.
2- cp your database.sql to isodatabase.sql, to keep a backup before applying the utf8 conversion.
3- apply iconv: iconv -f iso-8859-1 -t utf-8 isodatabase.sql > utf8database.sql
4- cp your utf8database.sql to test.sql, to keep a backup before upgrading to 1.5.x.
5- run the upgrade from the web browser with test.sql
6- if everything goes well, your system is on version 1.5.x. (but on my specific system it fails....)
Well, I think this is correct; it's all I know about the process...
Good Luck...
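For readers without iconv at hand, the backup-then-convert steps can be sketched in Python. This is only an illustration under assumptions: the function name convert_dump and the one-line sample dump are hypothetical, and the dump is assumed to be a plain-text SQL file encoded in ISO-8859-1. As noted later in this thread, a dump containing compressed or binary entries would be corrupted by this kind of blanket re-encoding.

```python
import shutil
import tempfile
from pathlib import Path


def convert_dump(src: Path, dst: Path, src_encoding: str = "iso-8859-1") -> None:
    """Re-encode a plain-text dump to UTF-8, line by line (iconv-like)."""
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)


# Hypothetical sample dump, written as Latin-1 bytes in a scratch directory.
workdir = Path(tempfile.mkdtemp())
dump = workdir / "database.sql"
dump.write_bytes("-- Padrões de Programação\n".encode("iso-8859-1"))

# Keep an untouched copy before converting (step 2 of the list).
iso_copy = workdir / "isodatabase.sql"
shutil.copy2(dump, iso_copy)

# Convert the copy, leaving the original dump as it was (step 3).
utf8_copy = workdir / "utf8database.sql"
convert_dump(iso_copy, utf8_copy)
print(utf8_copy.read_bytes())
```

The original dump is never written to, so a failed conversion can simply be thrown away and retried.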
If you manage to upgrade to 1.5.x, please write your story here: http://meta.wikimedia.org/wiki/Help:Upgrading_MediaWiki
I have a Spanish wiki, and the ISO conversion is more difficult.
bye.
Andre Oliveira da Costa wrote:
I had the same problem, when using the upgrade1_5.php and update.php scripts. After I put this into my LocalSettings.php it worked though:
$wgLegacyEncoding = 'ISO-8859-1';
Mmmh... this looks promising. But with this setting did you need to run upgrade1_5.php after all? Or did this simply tell mediawiki to "accept" iso-8859-1, and you left your DB untouched?
$wgLegacyEncoding tells the wiki to apply upconversion at load time on old text entries that don't have the 'utf-8' flag on them.
It does _not_ apply any conversion at run time for page titles, usernames, edit comments, etc. These are what upgrade1_5.php will apply transformations to if you have the reencoding trigger on. (This was created primarily for internal use at Wikimedia and is not well documented, sorry. Read the source code to check for configuration options.)
If I were sure this setting is going to be around forever, I would probably go this way, too. But I am afraid iso-8859-1 might not be supported at all in the near future, in which case we would just be putting off an inevitable upgrade to UTF-8... Do you know if this setting is official or is it an "undocumented feature"?
From 1.5 on *only* UTF-8 is supported.
$wgLegacyEncoding for upconverting old text entries (not anything else, just text entries) will continue to be there for the foreseeable future.
-- brion vibber (brion @ pobox.com)
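The load-time upconversion described in this message can be pictured with a small Python sketch. It is not MediaWiki's code: the function load_text and the representation of row flags as a Python set are assumptions made purely for illustration. The idea is that a stored text entry is decoded as UTF-8 when it carries the 'utf-8' flag, and as the legacy encoding otherwise.

```python
def load_text(raw: bytes, flags: set, legacy_encoding: str = "iso-8859-1") -> str:
    """Decode one stored text entry, upconverting unflagged legacy rows."""
    if "utf-8" in flags:
        # Entry saved after the switch to UTF-8: already in the right encoding.
        return raw.decode("utf-8")
    # Old entry without the flag: interpret it in the legacy encoding.
    return raw.decode(legacy_encoding)


# Hypothetical rows: one written before the upgrade, one after.
old_row = ("Programação".encode("iso-8859-1"), set())
new_row = ("Programação".encode("utf-8"), {"utf-8"})

print(load_text(*old_row) == load_text(*new_row))
```

Nothing stored is rewritten in this scheme, which is why titles, usernames and edit comments still need the separate conversion mentioned above.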
Hi Brion,
On Wed, 09 Nov 2005 12:23:57 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
I had the same problem, when using the upgrade1_5.php and update.php scripts. After I put this into my LocalSettings.php it worked though:
$wgLegacyEncoding = 'ISO-8859-1';
Mmmh... this looks promising. But with this setting did you need to run upgrade1_5.php after all? Or did this simply tell mediawiki to "accept" iso-8859-1, and you left your DB untouched?
$wgLegacyEncoding tells the wiki to apply upconversion at load time on old text entries that don't have the 'utf-8' flag on them.
It does _not_ apply any conversion at run time for page titles, usernames, edit comments, etc. These are what upgrade1_5.php will apply transformations to if you have the reencoding trigger on. (This was created primarily for internal use at Wikimedia and is not well documented, sorry. Read the source code to check for configuration options.)
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
If I were sure this setting is going to be around forever, I would probably go this way, too. But I am afraid iso-8859-1 might not be supported at all in the near future, in which case we would just be putting off an inevitable upgrade to UTF-8... Do you know if this setting is official or is it an "undocumented feature"?
From 1.5 on *only* UTF-8 is supported.
$wgLegacyEncoding for upconverting old text entries (not anything else, just text entries) will continue to be there for the foreseeable future.
Thks for the explanation. At least now I have a more precise picture of the problem...
Best,
Andre
Andre Oliveira da Costa wrote:
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
Well, it worked fine for us, so I'm not sure what there is to fix?
-- brion vibber (brion @ pobox.com)
On Wed, 09 Nov 2005 13:43:10 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
Well, it worked fine for us, so I'm not sure what there is to fix?
Mmmh... "Houston, we got a problem" ;-)
Not sure why it didn't work for me, then. Does my description of the problem on bug #3898 [http://bugzilla.wikimedia.org/show_bug.cgi?id=3898] give you any clue about what could have failed on my upgrade (and, judging by some of the replies, on others' as well)? If not, what additional info can I provide to help you track down the problem?
Best,
Andre
Andre Oliveira da Costa wrote:
On Wed, 09 Nov 2005 13:43:10 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
Well, it worked fine for us, so I'm not sure what there is to fix?
Mmmh... "Houston, we got a problem" ;-)
Not sure why it didn't work for me, then. Does my description of the problem on bug #3898 [http://bugzilla.wikimedia.org/show_bug.cgi?id=3898] give you any clue about what could have failed on my upgrade (and, judging by some of the replies, on others' as well)? If not, what additional info can I provide to help you track down the problem?
* Operating system and version
* PHP version
* PHP configuration
* Which PHP modules are installed
* MySQL version
* MySQL configuration
* LocalSettings.php
* Any other modifications or settings on MediaWiki
* Exact, step-by-step procedure you used for running the upgrade
* If possible, sample data files
Chars being replaced by commas is not something I've ever seen.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
Andre Oliveira da Costa wrote:
On Wed, 09 Nov 2005 13:43:10 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
Well, it worked fine for us, so I'm not sure what there is to fix?
Mmmh... "Houston, we got a problem" ;-)
Not sure why it didn't work for me, then. Does my description of the problem on bug #3898 [http://bugzilla.wikimedia.org/show_bug.cgi?id=3898] give you any clue about what could have failed on my upgrade (and, judging by some of the replies, on others' as well)? If not, what additional info can I provide to help you track down the problem?
- Operating system and version
- PHP version
- PHP configuration
- Which PHP modules are installed
- MySQL version
- MySQL configuration
- LocalSettings.php
- Any other modifications or settings on MediaWiki
- Exact, step-by-step procedure you used for running the upgrade
- If possible, sample data files
Chars being replaced by commas is not something I've ever seen.
-- brion vibber (brion @ pobox.com)
I have tracked down (actually a co-worker gets the credit for this) my reason for failing to upgrade to this:
- I have no access to a shell prompt to start PHP scripts manually
- my MySQL engine is version 4.0, so there is no ALTER DATABASE command to change from latin1 to utf8 (and no, I don't know when my provider will upgrade this version)
ruud habets
Hi Brion,
On Wed, 09 Nov 2005 14:23:33 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
On Wed, 09 Nov 2005 13:43:10 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
Well, it worked fine for us, so I'm not sure what there is to fix?
Mmmh... "Houston, we got a problem" ;-)
Not sure why it didn't work for me, then. Does my description of the problem on bug #3898 [http://bugzilla.wikimedia.org/show_bug.cgi?id=3898] give you any clue about what could have failed on my upgrade (and, judging by some of the replies, on others' as well)? If not, what additional info can I provide to help you track down the problem?
- Operating system and version
- PHP version
- PHP configuration
- Which PHP modules are installed
- MySQL version
- MySQL configuration
- LocalSettings.php
- Any other modifications or settings on MediaWiki
- Exact, step-by-step procedure you used for running the upgrade
- If possible, sample data files
Chars being replaced by commas is not something I've ever seen.
Fair enough =)
I actually tested the upgrade on my home computer, which runs FC4 with up-to-date packages. However, the production server runs RHEL AS 3, and package versions differ significantly (e.g. production MySQL is 3.23.58, home MySQL is 4.1.14). I am at work now, so I won't be able to gather all the information you requested; I will get back to you later with all the data.
I haven't tested the upgrade on the production server, so I can't tell right now if it works there or not (I will need to be very careful with its DB, and these tests will also imply downtime, so it's not a trivial move; at home I can trash the DB anytime I need ;-)).
I might be away for a couple of days, so it's likely I can only provide the required data next week. I will post the info here ASAP.
Questions:
- PHP configuration = /etc/php.ini, /etc/php.d/* ?
- MySQL configuration = /etc/my.cnf ?
- can I send any files attached directly to your email?
TIA
Andre
Hi Brion,
I was finally able to dedicate some time to this. Below I post more detailed info.
On Wed, 09 Nov 2005 14:23:33 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
On Wed, 09 Nov 2005 13:43:10 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Ok, got it. Any chance this iso-8859-1 --> utf-8 problem will get fixed on future releases?
Well, it worked fine for us, so I'm not sure what there is to fix?
Mmmh... "Houston, we got a problem" ;-)
Not sure why it didn't work for me, then. Does my description of the problem on bug #3898 [http://bugzilla.wikimedia.org/show_bug.cgi?id=3898] give you any clue about what could have failed on my upgrade (and, judging by some of the replies, on others' as well)? If not, what additional info can I provide to help you track down the problem?
- Operating system and version
FC4, all latest updates applied, kernel 2.6.14-1.1637_FC4
- PHP version
PHP 5.0.4
- PHP configuration
Should I send /etc/php.ini directly to your email address? It's quite big to be posted here...
- Which PHP modules are installed
php-pear-5.0.4-10.5
php-pgsql-5.0.4-10.5
php-jpgraph-1.19-1.2.fc4.rf
php-snmp-5.0.4-10.5
php-mbstring-5.0.4-10.5
php-mysql-5.0.4-10.5
(I don't use all of them, some -- like snmp -- have been installed for testing purposes; actually, I don't really _use_ PHP at home, apache is off by default)
- MySQL version
MySQL 4.1.14
- MySQL configuration
~ cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1

[mysql.server]
user=mysql
basedir=/var/lib

[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
- LocalSettings.php
I don't recall doing any tweaking to it, but I can forward it to you if you need it.
- Any other modifications or settings on MediaWiki
I haven't applied any modification on this test setup. 1.4.0 installation has some modifications, but they are related to access rights, not content. Still, if this is relevant, I can provide them to you.
- Exact, step-by-step procedure you used for running the upgrade
- extract content from mediawiki-1.5.2.tar.gz at /var/www/html
- make symlink from /var/www/html/mediawiki-1.5.2 to /var/www/html/wiki
  PS: [wiki] = /var/www/html/wiki
- create 'wikidb' DB on MySQL
- create 'wikiuser' with full rights on wikidb.*
- import MediaWiki 1.4.2 DB dump (with mysql -u wikiuser -h localhost -p wikidb < wikidb-dump.sql)
- copy [wiki]/AdminSettings.sample to [wiki]/AdminSettings.php, configure it with proper MySQL root user/password
- copy LocalSettings.php from current (1.4.0) installation to [wiki] dir
- run 'php -f maintenance/upgrade1_5.php' from dir [wiki] (I captured its output in case it helps)
- run 'php -f maintenance/update.php' (captured output of this one as well)
- chown -R apache:apache [wiki]
After this, [host]/wiki is online, but content doesn't display right -- latin-1 chars are replaced by commas.
Investigating the problem more thoroughly, I guess I now have a better understanding of what's happening: the commas are just Firefox's way of telling me it found chars incompatible with the page encoding (sorry for the false alarm, I should have realized this). In this case, latin-1 chars were left behind during the conversion, while the page encoding is utf-8. If I force the browser to use latin-1 encoding, the page displays fine -- well, almost, some content seems to be missing.
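The display effect described here can be reproduced in a few lines of Python. This is a rough model, not what Firefox does internally: the helper name render is hypothetical, and the browser's handling of invalid bytes is approximated with Python's errors="replace", which substitutes U+FFFD for each undecodable sequence.

```python
def render(stored: bytes, page_encoding: str) -> str:
    """Decode stored bytes the way a page declared as page_encoding would."""
    return stored.decode(page_encoding, errors="replace")


# Text that was left in Latin-1 by the upgrade.
stored = "Padrões de Programação".encode("iso-8859-1")

as_utf8 = render(stored, "utf-8")         # page says utf-8, bytes are latin-1
as_latin1 = render(stored, "iso-8859-1")  # encoding forced to latin-1

print(ascii(as_utf8))
print(ascii(as_latin1))
```

The accented letters come out as replacement characters in the first case and intact in the second, which matches what forcing the browser encoding shows.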
Eg. this page:
http://shadow/wiki/index.php?title=Padr%F5es_de_Programa%E7%E3o&action=e...
which should point to a page titled "Padrões de Programação" appears as missing (i.e. link appears as ...&action=edit). If I go to the special "dead end pages" page, I see this link there:
http://shadow/wiki/index.php?title=Padr%C3%B5es_de_Programa%C3%A7%C3%A3o
If I follow it, the "missing" content is there, with latin-1 chars (so we're back to the "commas" issue again). The page title is utf-8, but the rest of the content is latin-1.
Judging by this, it seems the upgrade1_5.php script did convert URLs (and consequently page titles) from latin-1 to utf-8, but some or all of the page content was not converted.
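The two URLs above differ only in how the same title is percent-encoded, which a short Python sketch can make explicit. It uses the standard urllib.parse functions; the variable names are just for illustration.

```python
from urllib.parse import quote, unquote

title = "Padrões_de_Programação"

# The same title, percent-encoded from its Latin-1 bytes and its UTF-8 bytes.
latin1_url = quote(title, encoding="iso-8859-1")
utf8_url = quote(title, encoding="utf-8")

print(latin1_url)  # Padr%F5es_de_Programa%E7%E3o
print(utf8_url)    # Padr%C3%B5es_de_Programa%C3%A7%C3%A3o

# A Latin-1 style link read as UTF-8 no longer names the same title,
# so a lookup keyed on the UTF-8 title would not find the page.
misread = unquote(latin1_url, encoding="utf-8", errors="replace")
print(misread == title)
```

Old links embedded in unconverted page text keep the Latin-1 form, while the converted titles expect the UTF-8 form, which would explain why such links show up as missing pages.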
I hope this helps you guys understand what happened -- and fix the upgrade script so that I can migrate to 1.5.2 =)
- If possible, sample data files
Well, I would be happy to provide configuration files for MySQL and PHP, and also output from conversion scripts. Just let me know if you guys would like to take a look at them, and where should I send them to. I can also provide some content if it will help.
Chars being replaced by commas is not something I've ever seen.
You're 100% right, it was a silly oversight. My bad, sorry for the confusion. Still, there is indeed some problem with the latin-1 --> utf-8 conversion process...
Any ideas? Anything else I could provide?
Best,
Andre
Andre Oliveira da Costa wrote:
If I follow it, the "missing" content is there, with latin-1 chars (so we're back to the "commas" issue again). The page title is utf-8, but the rest of the content is latin-1.
Judging by this, it seems the upgrade1_5.php script did convert URLs (and consequently page titles) from latin-1 to utf-8, but some or all of the page content was not converted.
This is normal; set $wgLegacyEncoding for runtime conversion of old text entries.
-- brion vibber (brion @ pobox.com)
Hi Brion,
On Wed, 16 Nov 2005 13:08:59 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
If I follow it, the "missing" content is there, with latin-1 chars (so we're back to the "commas" issue again). The page title is utf-8, but the rest of the content is latin-1.
Judging by this, it seems the upgrade1_5.php script did convert URLs (and consequently page titles) from latin-1 to utf-8, but some or all of the page content was not converted.
This is normal; set $wgLegacyEncoding for runtime conversion of old text entries.
... wow, that's it? ;-) I thought it would be harder =) Thank God it's simple...
I'll try it ASAP and post here the results. However, a couple of questions:
- shouldn't upgrade1_5.php convert text entries as well? What's the point in "half converting" to utf-8?
- will $wgLegacyEncoding be around in future releases? After all, remaining non-utf8 chars will be in the DB forever since the upgrade script didn't catch them.
- would a "manual" conversion using iconv (as erchache2000 suggested on this thread) after upgrade1_5.php and update.php have been applied successfully convert the whole DB from latin-1 to utf-8? Could this have any side-effects that could compromise the DB? (e.g. if images or any other binary data are stored in the DB, I don't know if iconv is smart enough not to mess with "latin-1 chars" it might find within binary content)
I don't mind having latin-1 chars left behind by the upgrade process as long as $wgLegacyEncoding is not a temporary workaround, so if the answer to question #2 is "yes" you can forget I asked #3... ;-)
Thanks again for the help,
Andre
Andre Oliveira da Costa wrote:
Brion Vibber wrote:
This is normal; set $wgLegacyEncoding for runtime conversion of old text entries.
... wow, that's it? ;-) I thought it would be harder =) Thank God it's simple...
;)
I'll try it ASAP and post here the results. However, a couple of questions:
- shouldn't upgrade1_5.php convert text entries as well?
No.
What's the point in "half converting" to utf-8?
Ask that again when _you_ have sixteen million precompressed text records in Latin-1 and hundreds of people calling for your blood every minute the site is offline for the upgrade. ;)
- will $wgLegacyEncoding be around in future releases? After all, remaining non-utf8 chars will be in the DB forever since the upgrade script didn't catch them.
Yes.
- would a "manual" conversion using iconv (as erchache2000 suggested on this thread) after upgrade1_5.php and update.php have been applied successfully convert the whole DB from latin-1 to utf-8?
Depends on your database...
Could this have any side-effects that could compromise the DB? (e.g. if images or any other binary data are stored in the DB, I don't know if iconv is smart enough not to mess with "latin-1 chars" it might find within binary content)
Yes, that would corrupt any compressed text entries. If you're using compressed old records you'd have to decompress the whole table before running such a conversion.
-- brion vibber (brion @ pobox.com)
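The corruption described above is easy to reproduce outside MediaWiki. The following Python sketch uses `zlib` as a stand-in for whatever compression the old text records use (an assumption; the thread does not name the format) and applies a blind Latin-1 to UTF-8 pass to the compressed bytes, which is what a whole-dump iconv run would do. The helper names are hypothetical.

```python
import zlib


def naive_latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode every byte as if it were a Latin-1 character (a blind conversion pass)."""
    return data.decode("iso-8859-1").encode("utf-8")


def decompresses_cleanly(blob: bytes) -> bool:
    """Report whether blob is still a valid zlib stream."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


original = "camión, méxico, ação".encode("iso-8859-1")
compressed = zlib.compress(original)

# Blind conversion of the compressed bytes: every byte >= 0x80 becomes two bytes.
converted = naive_latin1_to_utf8(compressed)
print(decompresses_cleanly(compressed))
print(decompresses_cleanly(converted))

# Safe order: decompress first, convert the text, then compress again.
fixed = zlib.compress(naive_latin1_to_utf8(zlib.decompress(compressed)))
print(zlib.decompress(fixed).decode("utf-8") == "camión, méxico, ação")
```

The same reasoning applies to any binary column: the conversion has to be applied to decoded text only, never to raw compressed or binary bytes.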
Hi Brion,
On Wed, 16 Nov 2005 13:46:40 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Brion Vibber wrote:
This is normal; set $wgLegacyEncoding for runtime conversion of old text entries.
... wow, that's it? ;-) I thought it would be harder =) Thank God it's simple...
;)
I'll try it ASAP and post here the results. However, a couple of questions:
- shouldn't upgrade1_5.php convert text entries as well?
No.
What's the point in "half converting" to utf-8?
Ask that again when _you_ have sixteen million precompressed text records in Latin-1 and hundreds of people calling for your blood every minute the site is offline for the upgrade. ;)
Ok, got the message ;-) Yeah, from that perspective, it makes perfect sense. And, after all, all that matters is that at the end of the day the conversion (be it partial or full) works, and all content is accounted for and properly accessible after the upgrade (confirmation still pending in my case! ;-))
- will $wgLegacyEncoding be around in future releases? After all, remaining non-utf8 chars will be in the DB forever since the upgrade script didn't catch them.
Yes.
Cool. I rest my case then ;-)
- would a "manual" conversion using iconv (as erchache2000 suggested on this thread) after upgrade1_5.php and update.php have been applied successfully convert the whole DB from latin-1 to utf-8?
Depends on your database...
Could this have any side-effects that could compromise the DB? (e.g. if images or any other binary data are stored in the DB, I don't know if iconv is smart enough not to mess with "latin-1 chars" it might find within binary content)
Yes, that would corrupt any compressed text entries. If you're using compressed old records you'd have to decompress the whole table before running such a conversion.
Ok. In other words: "don't do it unless you're pretty sure you know what you're doing" ;-) I didn't even know there could be compressed text records, but they're as much of a problem in this case as any binary file. As for me, I'll leave it as it is ;-)
Thanks for all the help. I will try $wgLegacyEncoding at home and report the results here.
Best,
Andre
PS: BTW: IMHO some of this conversation could be summarized in the UPGRADE docs; it could save others some worry.
Hi Brion,
On Thu, 17 Nov 2005 11:45:39 -0200 Andre Oliveira da Costa costa@tecgraf.puc-rio.br wrote:
[...]
Ok, got the message ;-) Yeah, from that perspective, it makes perfect sense. And, after all, all that matters is that at the end of the day the conversion (be it partial or full) works, and all content is accounted for and properly accessible after the upgrade (confirmation still pending in my case! ;-))
Well, I confirmed it this weekend: setting $wgLegacyEncoding to 'ISO-8859-1' (as suggested on this thread) didn't fix the latin1 + utf8 display problem -- pages are still served as utf8 with latin1 chars in them, which the browser does not display correctly unless I force latin1 encoding.
I put this in LocalSettings.php (_after_ upgrading the DB):
~ grep wgLegacy LocalSettings.php
$wgLegacyEncoding = 'ISO-8859-1';
Is this right? Should I have set this before the conversion? I restarted apache and cleared Firefox's cache, just to make sure, but there was no change on the output.
I just hope I am overlooking something simple here...
TIA
Andre
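The symptom described here (a page declared as utf8 but containing raw latin1 bytes) can be reproduced with a few lines of Python. This is only an illustrative sketch; `is_valid_utf8` is a hypothetical helper, and `ascii()` is used when printing so the output does not depend on the terminal's encoding.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "çéõ".encode("iso-8859-1")  # what an unconverted row contains
utf8_bytes = "çéõ".encode("utf-8")         # what a converted row contains

print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(utf8_bytes))

# What a browser does with latin1 bytes on a page declared as UTF-8:
# undecodable bytes turn into replacement characters.
print(ascii(latin1_bytes.decode("utf-8", errors="replace")))

# Forcing latin1 in the browser corresponds to decoding with the right charset.
print(latin1_bytes.decode("iso-8859-1") == "çéõ")
```

A check like `is_valid_utf8` on a sample of stored rows is one way to tell which entries were left unconverted, with the caveat that pure-ASCII text passes as valid in both encodings.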
Andre Oliveira da Costa wrote:
Is this right? Should I have set this before the conversion? I restarted apache and cleared Firefox's cache, just to make sure, but there was no change on the output.
I just hope I am overlooking something simple here...
Parser cache?
-- brion vibber (brion @ pobox.com)
On Mon, 21 Nov 2005 09:29:50 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Is this right? Should I have set this before the conversion? I restarted apache and cleared Firefox's cache, just to make sure, but there was no change on the output.
I just hope I am overlooking something simple here...
Parser cache?
You mean like mmcache or something like this? I don't have any PHP accelerators installed at home...
TIA,
Andre
Andre Oliveira da Costa wrote:
On Mon, 21 Nov 2005 09:29:50 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Is this right? Should I have set this before the conversion? I restarted apache and cleared Firefox's cache, just to make sure, but there was no change on the output.
I just hope I am overlooking something simple here...
Parser cache?
You mean like mmcache or something like this? I don't have any PHP accelerators installed at home...
No, the parser cache, which caches rendered wikitext output from the parser.
-- brion vibber (brion @ pobox.com)
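Why a cache of rendered output can hide a configuration change is easier to see in a toy model. The Python sketch below is purely conceptual and is not MediaWiki code: `RenderCache`, its generation counter, and the `config` dictionary are all hypothetical. MediaWiki's `$wgCacheEpoch` setting is documented as working on a similar invalidate-everything-older principle, though the thread does not confirm it as the fix here.

```python
from typing import Callable, Dict, Optional, Tuple


class RenderCache:
    """Toy cache of rendered output keyed by page title."""

    def __init__(self) -> None:
        self.generation = 0
        self._entries: Dict[str, Tuple[int, str]] = {}

    def get(self, title: str, render: Callable[[], str]) -> str:
        entry = self._entries.get(title)
        if entry is not None and entry[0] == self.generation:
            return entry[1]  # cache hit: render() is not called at all
        html = render()
        self._entries[title] = (self.generation, html)
        return html

    def invalidate_all(self) -> None:
        # Entries from older generations are ignored from now on.
        self.generation += 1


config: Dict[str, Optional[str]] = {"legacy_encoding": None}
stored = "ação".encode("iso-8859-1")  # unconverted row


def render() -> str:
    return stored.decode(config["legacy_encoding"] or "utf-8", errors="replace")


cache = RenderCache()
before = cache.get("Page", render)        # garbled output gets cached
config["legacy_encoding"] = "iso-8859-1"  # the setting is fixed...
stale = cache.get("Page", render)         # ...but the cached copy is still served
cache.invalidate_all()
fresh = cache.get("Page", render)         # re-rendered with the new setting

print(stale == before, fresh == "ação")
```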
On Mon, 21 Nov 2005 12:55:56 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
On Mon, 21 Nov 2005 09:29:50 -0800 Brion Vibber brion@pobox.com wrote:
Andre Oliveira da Costa wrote:
Is this right? Should I have set this before the conversion? I restarted apache and cleared Firefox's cache, just to make sure, but there was no change on the output.
I just hope I am overlooking something simple here...
Parser cache?
You mean like mmcache or something like this? I don't have any PHP accelerators installed at home...
No, the parser cache, which caches rendered wikitext output from the parser.
How do I clean it?
TIA
Andre
Well, I'm trying to upgrade too, and I'm about 60% of the way there. I followed all the steps I posted on meta.wikimedia.org, http://meta.wikimedia.org/wiki/Help:Upgrading_MediaWiki#Upgrading_to_Version....
I described all the steps there, but I have a small problem: I can access all information except recent changes (I lost all of them...) and some images whose names contain accent marks (if that is the English term :-S...) like camión, méxico, etc...
My database is very big and perhaps that is the problem :-( I tried running rebuildall and so on... without any success. Well, perhaps I need to hit the computer hard ;-) It looks like the process stopped halfway...
Please write your next mail more clearly. It would be welcome ;P
Write again and good luck. And feel free to modify my comments....
Uhm... perhaps I'm very tired and didn't read the last mail properly... sorry for the inconvenience... :D