Platonides wrote:
O. O. wrote:
I then attempted to edit the SQL file, i.e. replace the line
) TYPE=InnoDB;
with
) TYPE=InnoDB DEFAULT CHARSET=binary;
This works, in the sense that the new table now gets created as binary.
However, I think I am making mistakes in editing the file. These files
are rather large, so I wrote code in Perl, and again in Java, to do the
editing. Both manage to do the above substitution, but I am not
entirely confident about their UTF-8 handling.
You can also use sed to edit it:
$ sed -i "n;n;n;n;n;n;n;n;n;n;n;n;n;n;n;n;n;s/InnoDB/InnoDB DEFAULT CHARSET=binary/" enwiki-20090306-pagelinks.sql
Thanks for your reply, Platonides. I am trying your suggestion right now.
It would take a few hours to crash, if it does. (I hope sed handles
UTF-8 correctly.) I will try yesterday's pagelinks.sql later.
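One way to check the UTF-8 worry before committing to the multi-hour run is to try the substitution on a tiny sample first. The following is only a sketch under assumptions: the sample file and its contents are made up, and the substitution is anchored on the whole line instead of using the n;n;... form. Running sed under LC_ALL=C makes it treat the input as plain bytes, so multi-byte sequences should be copied through untouched.

```shell
# Hypothetical sample mimicking the end of the CREATE TABLE statement,
# followed by one data row that contains non-ASCII UTF-8 text.
tmp=$(mktemp -d)
printf '%s\n' \
  'CREATE TABLE pagelinks (' \
  '  pl_title varchar(255)' \
  ') TYPE=InnoDB;' \
  "INSERT INTO pagelinks VALUES (1,0,'Zürich');" > "$tmp/sample.sql"

# LC_ALL=C: byte-oriented processing, no character decoding involved.
# Output goes to a new file so the original stays available for comparison.
LC_ALL=C sed 's/^) TYPE=InnoDB;$/) TYPE=InnoDB DEFAULT CHARSET=binary;/' \
  "$tmp/sample.sql" > "$tmp/edited.sql"

# How many lines now carry the new table option?
changed=$(grep -c 'DEFAULT CHARSET=binary' "$tmp/edited.sql")

# Checksums of the data row before and after the edit.
row_before=$(tail -n 1 "$tmp/sample.sql" | cksum)
row_after=$(tail -n 1 "$tmp/edited.sql" | cksum)

echo "lines changed: $changed"
[ "$row_before" = "$row_after" ] && echo "data row unchanged" || echo "data row differs"
```

If the data row checksums match on the sample, the same byte-oriented substitution should be safe on the full dump, since the pattern itself is pure ASCII.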
So, assuming I make the change you suggested above, i.e. explicitly
set the “DEFAULT CHARSET” to binary, would there be any problems
importing with
$ mysql wikidb < enwiki-20090306-pagelinks.sql
I am using Linux (Ubuntu). My question is whether the shell, which
handles the input redirection (<), could modify the characters before
mysql gets them. Right now I think the shell supports UTF-8, but I hope
it is not messing things up.
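A quick way to test this would be to compare a checksum of the file on disk with a checksum of what a program receives through the redirection. This is only a sketch: the sample file is hypothetical, and cksum stands in for mysql as the program reading standard input.

```shell
# Hypothetical sample file with a non-ASCII title stored as UTF-8.
sample=$(mktemp)
printf '%s\n' "INSERT INTO pagelinks VALUES (1,0,'Zürich');" > "$sample"

# Checksum and byte count when cksum opens the file itself.
# (awk drops the trailing file name so both results have the same shape.)
on_disk=$(cksum "$sample" | awk '{print $1, $2}')

# Checksum and byte count of what arrives on standard input via "<".
# The shell only opens the file and hands it over as stdin.
received=$(cksum < "$sample" | awk '{print $1, $2}')

[ "$on_disk" = "$received" ] && echo "bytes identical" || echo "bytes differ"
```

As far as I understand it, the redirection just connects the file to the program's standard input, so the shell's locale does not come into play. How the bytes are interpreted after that would depend on the character set of the mysql connection, not on the shell.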
Thanks a lot.
O.O.