Re: [Wikitech-l] Correct way to Import SQL Dumps of Wikipedia into MediaWiki in Binary

22 May 2009

On Fri, May 22, 2009 at 6:09 PM, O. O. <olson_ot(a)yahoo.com> wrote:
> ...
> Thanks for your reply, Platonides. I am trying your suggestion right now.
> It would take a few hours to crash – if it does. (I hope sed handles
> UTF-8 correctly.) I will try yesterday's pagelinks.sql later.

sed treats UTF-8 as a stream of bytes.  Since the pattern is pure ASCII,
it can't match any part of a multibyte UTF-8 sequence (UTF-8 only uses
ASCII byte values to represent ASCII code points), so sed will just pass
those bytes through unchanged.
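
To illustrate the point, here is a minimal sketch, assuming GNU coreutils and sed. The sample row and the substitution are made up for the demonstration and are not taken from the real dump.

```shell
# Hypothetical sample row containing non-ASCII text.
sample="(1,0,'Zürich'),(2,0,'東京')"

# Dump the raw bytes for inspection: every byte of a multibyte
# UTF-8 sequence has its high bit set (0x80 or above).
printf '%s\n' "$sample" | od -An -tx1

# Delete all high-bit bytes from a purely non-ASCII string and count what
# is left. Zero remaining bytes means no ASCII byte values were present.
ascii_bytes=$(printf '%s' 'ü東京' | LC_ALL=C tr -d '\200-\377' | wc -c)
echo "ASCII bytes inside non-ASCII characters: $ascii_bytes"

# Run an ASCII-only substitution byte-wise (C locale). The multibyte
# sequences cannot match the pattern, so they come out unchanged.
result=$(printf '%s\n' "$sample" | LC_ALL=C sed 's/,0,/,4,/g')
printf '%s\n' "$result"
```

Setting LC_ALL=C is a choice made for this sketch to force byte-wise matching regardless of the system locale.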

That sed pattern is pretty horrifying and fragile, though.  I'd
recommend something more like:

  sed -i 's/^) TYPE=InnoDB;$/) TYPE=InnoDB DEFAULT CHARSET=binary;/'
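
As a rough way to try the anchored pattern before running it on a multi-gigabyte dump, the following sketch applies it to a tiny throwaway file. The table definition and row are hypothetical stand-ins, and the -i usage assumes GNU sed (BSD sed expects an argument after -i).

```shell
# Throwaway file with a made-up, shortened table definition and one row.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
CREATE TABLE `pagelinks` (
  `pl_from` int(8) unsigned NOT NULL default '0',
  `pl_title` varchar(255) binary NOT NULL default ''
) TYPE=InnoDB;
INSERT INTO `pagelinks` VALUES (1,0,'Zürich');
EOF

# The pattern is anchored at both ends, so only a line that consists
# exactly of ") TYPE=InnoDB;" is rewritten. Data rows are left alone.
LC_ALL=C sed -i 's/^) TYPE=InnoDB;$/) TYPE=InnoDB DEFAULT CHARSET=binary;/' "$tmp"

# Show the rewritten table-options line with its line number.
grep -n 'TYPE=InnoDB' "$tmp"
```

Because of the ^ and $ anchors, a row whose text happens to contain "TYPE=InnoDB" would not be touched, which is what makes this form less fragile.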

> ...
>   $ mysql wikidb < enwiki-20090306-pagelinks.sql
> I am using Linux (Ubuntu).  My question is whether the shell that does the
> pipe would modify the characters before mysql gets them. Right now I think
> the shell supports UTF-8, but I hope it is not messing things up.

The shell is only handing the mysql command a file descriptor.  mysql
will read the file itself directly; the shell won't touch any of the
input.
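
One way to convince yourself of this without involving mysql at all is the sketch below, where cmp stands in for mysql as the program reading stdin. The file contents are hypothetical and deliberately include bytes that are not valid UTF-8.

```shell
# Hypothetical stand-in for the dump: UTF-8 text plus raw binary bytes
# (0xFF, 0xFE and a NUL) that no text-aware layer would pass through intact.
dump=$(mktemp)
printf 'Z\303\274rich \377\376\000 end\n' > "$dump"

# cmp reads the redirected file on stdin ("-") and compares it byte for
# byte with the same file opened by name. The shell only opens the file
# and attaches it to fd 0; it never reads the data itself.
if cmp -s - "$dump" < "$dump"; then
    redirect_status=identical
else
    redirect_status=different
fi
echo "stdin via redirection vs. file on disk: $redirect_status"
```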
