I'm trying to move a MediaWiki 1.16 instance from one hosting provider to another, and the move is breaking all page titles that contain non-US-ASCII characters, in what looks like a UTF-8 decoding problem. Strangely, the page contents are fine after import, and I can create new pages with Scandinavian characters in their titles on the new provider without any issues, so I'm slightly baffled as to where the problem is.
I'm doing the move using mysqldump / import, since MediaWiki's importDump.php barfs on the new server when trying to import an XML dump ("XML-tuonti epäonnistui jäsennysvirheen takia. rivillä 1, sarakkeessa 1 (tavu 3; "<mediawiki"): Empty document", which in English is roughly "XML import failed due to a parse error. At line 1, column 1 (byte 3; "<mediawiki"): Empty document" - gotta love localized error messages).
I tried reading the SQL, and it looks to me like MediaWiki encodes the database contents in some "safe" format, so the problem should not be with MySQL?
I can't find anything related after reading all the documentation I could find. The encoding issues that I can find refer to global encoding problems, not just with page titles.
Any pointers where to look?
Other system details: Apache 2, Ubuntu, PHP 5.2.3.
Thanks,
sulka
Sulka Haro wrote:
Any pointers where to look?
http://www.mediawiki.org/wiki/Manual:Backing_up_a_wiki#Character_set
http://www.mediawiki.org/wiki/Manual:$wgDBmysql5
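For what it's worth, here is a minimal Python sketch of the kind of mismatch those pages are about: UTF-8 bytes being treated as latin1 somewhere between the dump and the import. It is only an illustration, not a diagnosis of this particular wiki. The function names are made up for the example, and Python's latin-1 codec is only an approximation of MySQL's latin1 (which is closer to cp1252). One plausible reason, and it is an assumption, why only titles break is that titles live in character-typed columns that get charset-converted, while page text is stored in blob columns that pass through untouched.

```python
# Sketch: how a UTF-8 title gets garbled when its bytes are treated as latin1
# somewhere in the dump/import path, and how to reverse it.
# The function names are hypothetical, not MediaWiki or MySQL APIs.

def garble(title: str) -> str:
    """Misread the UTF-8 bytes of `title` as latin1, as a mismatched charset would."""
    return title.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo garble(): recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    title = "Jäsennysvirhe"
    broken = garble(title)
    print(broken)          # JÃ¤sennysvirhe
    print(repair(broken))  # Jäsennysvirhe
```

If the broken titles on the new server look like the first printed line (each accented letter turned into two characters starting with "Ã"), that points at a dump/import charset mismatch of the kind the first link covers, rather than at MediaWiki itself.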
mediawiki-l@lists.wikimedia.org