Hi,
with MediaWiki 1.9.3, I have a big charset problem with German and French special characters. I'll begin with a basic question:
in the database, a word like "Université" is stored as "Université" in "page_title - varchar(255) - latin1_bin". Is that correct? I suppose not. Why is the collation "latin1_bin" when phpMyAdmin says "MySQL charset: UTF-8 Unicode (utf8)"?
In another database, for a personal PHP site, the same word is stored as "Université" in "varchar(100) latin1_swedish_ci" and this works.
In the wiki, all internal links with characters like "é", "ä" and so on don't work after export/import with phpMyAdmin.
cheers
Klaus
Klaus Becker wrote:
> Hi,
> with MediaWiki 1.9.3, I have a big charset problem with German and French special characters. I'll begin with a basic question:
> in the database, a word like "Université" is stored as "Université" in "page_title - varchar(255) - latin1_bin". Is that correct? I suppose not. Why is the collation "latin1_bin" when phpMyAdmin says "MySQL charset: UTF-8 Unicode (utf8)"?
Storing in latin1_bin is correct. MySQL's utf8 support is not as good as it should be (and wasn't always available), so MediaWiki stores the UTF-8 bytes in a latin1 table. Université should be stored as UniversitÃ©.
In your case, Université has been encoded again as UTF-8, as I warned you a week ago. This was likely caused by "intelligent" DB dumpers.
> In another database, for a personal PHP site, the same word is stored as "Université" in "varchar(100) latin1_swedish_ci" and this works.
> In the wiki, all internal links with characters like "é", "ä" and so on don't work after export/import with phpMyAdmin.
Because the titles in the page table are broken.
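To make the double encoding concrete, here is a minimal Python sketch of what happens to a title when UTF-8 bytes held in a latin1 column are converted to UTF-8 once more, and how one round of that conversion can be undone. The helper names `as_latin1_view` and `undo_one_round` are made up for illustration and are not part of MediaWiki or MySQL. Python's "latin-1" codec is only an approximation here: MySQL's latin1 is really cp1252, so a database tool may show slightly different garbage characters.

```python
def as_latin1_view(text: str) -> str:
    """Return how the UTF-8 bytes of `text` look when read as latin1.

    Applying this once models the way the thread says MediaWiki keeps
    titles in a latin1 column; applying it twice models the damage done
    by a dump tool that converts the column to UTF-8 again.
    """
    return text.encode("utf-8").decode("latin-1")


def undo_one_round(text: str) -> str:
    """Reverse one spurious latin1-to-UTF-8 conversion (hypothetical helper)."""
    return text.encode("latin-1").decode("utf-8")


title = "Université"
stored = as_latin1_view(title)    # what a latin1 view of the column shows
broken = as_latin1_view(stored)   # after one extra, unwanted conversion

print(stored)                     # UniversitÃ©
print(ascii(broken))              # escaped, because it contains a control character
print(undo_one_round(broken) == stored)
```

A title in the `broken` state no longer has the bytes that MediaWiki looks up, which would explain why links to such pages stop working after the import.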
> cheers
> Klaus
On Monday, 14 May 2007 21:44, Platonides wrote:
> Klaus Becker wrote:
>> Hi,
>> with MediaWiki 1.9.3, I have a big charset problem with German and French special characters. I'll begin with a basic question:
>> in the database, a word like "Université" is stored as "Université" in "page_title - varchar(255) - latin1_bin". Is that correct? I suppose not. Why is the collation "latin1_bin" when phpMyAdmin says "MySQL charset: UTF-8 Unicode (utf8)"?
> Storing in latin1_bin is correct. MySQL's utf8 support is not as good as it should be (and wasn't always available), so MediaWiki stores the UTF-8 bytes in a latin1 table. Université should be stored as UniversitÃ©.
> In your case, Université has been encoded again as UTF-8, as I warned you a week ago. This was likely caused by "intelligent" DB dumpers.
>> In another database, for a personal PHP site, the same word is stored as "Université" in "varchar(100) latin1_swedish_ci" and this works.
>> In the wiki, all internal links with characters like "é", "ä" and so on don't work after export/import with phpMyAdmin.
> Because the titles in the page table are broken.
Thanks for the explanation. I read some articles I found on the web and I understand it better now. In another mail I reported that the problem was resolved by using mysqldump instead of phpMyAdmin for backing up the database.
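For anyone finding this thread later, a rough sketch of such a mysqldump call, written as a small Python helper that only builds the command line. The function name `build_dump_command`, the database name `wikidb` and the output file name are placeholders, and the choice of `--default-character-set=latin1` is an assumption on my part (the idea being that mysqldump then does not convert the latin1 columns again); the exact options used for the fix are not stated above. Credentials are left out on purpose.

```python
import shlex


def build_dump_command(database: str, outfile: str, charset: str = "latin1") -> list[str]:
    """Build a mysqldump argument list (hypothetical helper, no credentials).

    --default-character-set and --result-file are standard mysqldump
    options; picking latin1 here is an assumption, not a confirmed fix.
    """
    return [
        "mysqldump",
        f"--default-character-set={charset}",
        f"--result-file={outfile}",
        database,
    ]


cmd = build_dump_command("wikidb", "wikidb.sql")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the usual user and password options have been added.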
cheers Klaus
mediawiki-l@lists.wikimedia.org