[Mediawiki-l] MySQL 4.1 & MediaWiki backup corruption warning
Brion Vibber
brion at pobox.com
Thu Nov 18 09:30:00 UTC 2004
Just a note of warning for those of you using MySQL 4.1: the new
charset options may cause mysqldump to write corrupted data into
backups, which then can't be restored without data loss.*
This may affect some Unicode text, and can certainly corrupt compressed
old revision text beyond recovery (text stored with the
$wgCompressRevisions option). If you're using MySQL 4.1, you should
examine and test your backup dumps to make sure they can be restored
and used successfully.
Passing an option like --default-character-set=latin1 may stop
mysqldump from trying to 'convert' (and thus corrupt) your data. (If
your server is not set to the defaults, this may or may not be the
correct value for you.) In the future hopefully we'll be able to play
nicer with the new character set settings, but for now MediaWiki
follows prior practice for older versions of MySQL where there was (and
remains) no ability to correctly indicate the charset used in a
particular database, table, or field.
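
As a rough illustration of passing that option from a backup script,
here is a minimal Python sketch. The helper names and the database name
"wikidb" are made up for the example, connection options (user,
password, host) are left out, and "latin1" is only right if your server
uses the defaults, as noted above.

```python
import subprocess
from typing import List


def build_dump_command(database: str, charset: str = "latin1") -> List[str]:
    """Build a mysqldump command line that pins the client character set.

    Hypothetical helper: "latin1" matches the option suggested above, but
    it may not be the correct value for a server with other settings.
    """
    return ["mysqldump", f"--default-character-set={charset}", database]


def dump_to_file(database: str, out_path: str, charset: str = "latin1") -> None:
    """Run mysqldump and write its raw output to out_path.

    The file is opened in binary mode so Python itself does not apply any
    further re-encoding to the dump.
    """
    with open(out_path, "wb") as out:
        subprocess.run(build_dump_command(database, charset),
                       stdout=out, check=True)


# Example (needs mysqldump on PATH and a database called "wikidb"):
# dump_to_file("wikidb", "wikidb-backup.sql")
```

Whatever options you settle on, restoring the dump into a scratch
database and comparing a few old revisions is still the real test.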
* Specifically, a default "latin-1" to UTF-8 conversion silently
corrupts all bytes with the values 0x81, 0x8d, 0x8f, 0x90, or 0x9d by
turning them into literal question marks. The question marks cannot be
returned to their original byte values when the data is re-imported.
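
To make the footnote concrete, here is a small Python sketch that
imitates the lossy step. It is a simulation and not MySQL's actual
conversion code: it assumes that Python's "cp1252" codec leaves the same
five byte values unmapped as MySQL's latin1 does, and all function names
are invented for the example.

```python
# The five byte values named in the footnote above.
UNMAPPED = bytes([0x81, 0x8D, 0x8F, 0x90, 0x9D])


def simulate_dump_conversion(raw: bytes) -> bytes:
    """Imitate the latin1 -> UTF-8 step: unmapped bytes become '?'."""
    text = raw.decode("cp1252", errors="replace")  # unmapped -> U+FFFD
    return text.replace("\ufffd", "?").encode("utf-8")


def simulate_restore(dumped: bytes) -> bytes:
    """Imitate the re-import: UTF-8 text back to single-byte latin1."""
    return dumped.decode("utf-8").encode("cp1252")


def at_risk_offsets(blob: bytes) -> list:
    """Return the offsets of bytes that the conversion would destroy."""
    return [i for i, b in enumerate(blob) if b in UNMAPPED]


original = b"caf\xe9 \x81\x9d"
restored = simulate_restore(simulate_dump_conversion(original))

print(at_risk_offsets(original))  # [5, 6]
print(restored)                   # b'caf\xe9 ??'
print(restored == original)       # False
```

The ordinary accented character survives the round trip, while the two
unmapped bytes come back as question marks. In compressed revision text
any byte value can occur, so even one such substitution can leave the
data impossible to decompress.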
-- brion vibber (brion @ pobox.com)