Re: [Wikitech-l] how to convert the latin1 SQL dump back into UTF-8?

11 Mar 2009


      Hoi,
If you are interested in collation, you may want to look into the CLDR, which
is where the collations are registered per language. There is no such thing
as a universally correct sorting algorithm. NB: the CLDR is a Unicode
project.
Thanks,
     GerardM
2009/3/11 Aryeh Gregor <Simetrical+wikilist@gmail.com>
...
...
On Wed, Mar 11, 2009 at 6:14 AM, Daniel Kinzler daniel@brightbyte.de
wrote:
...
There is none. Sorting is done by the database. That is to say, in the default
"compatibility" mode, binary "collation" is used - that is, byte-by-byte
comparison of UTF-8 encoded data. Which sucks. But we are stuck with it until
MySQL gets proper Unicode support.
And until we upgrade to that version.  MySQL 4 doesn't have *any*
Unicode support -- or any character encoding support, in fact.  Everything
is binary.
But we don't have to wait on MySQL.  We would just have to store a
Unicode sortkey in cl_sortkey instead of the actual Unicode
characters.  This would require an implementation of a Unicode sorting
algorithm in MediaWiki.  It could be language-specific or whatever you
want.
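
To make the sortkey idea concrete, here is a minimal sketch in Python (not MediaWiki code). It assumes a deliberately simplified key: decompose the title, drop combining marks, and case-fold it. This is not the Unicode Collation Algorithm and uses no CLDR tailoring; the function name sortkey and the sample titles are hypothetical. The point is only that a database comparing such keys byte by byte, as it would for cl_sortkey, can still return a sensible order.

```python
import unicodedata


def sortkey(title: str) -> bytes:
    """Hypothetical, simplified sortkey (NOT the Unicode Collation Algorithm)."""
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks so that "É" sorts with "E".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold so that upper and lower case sort together.
    primary = stripped.casefold().encode("utf-8")
    # Append the original bytes as a tie-breaker so distinct titles get distinct keys.
    return primary + b"\x00" + title.encode("utf-8")


titles = ["Zebra", "Émile", "apple", "Emil"]

# What a binary "collation" does: compare the raw UTF-8 bytes.
binary_order = sorted(titles, key=lambda t: t.encode("utf-8"))

# Same byte-by-byte comparison, but applied to the precomputed sortkeys.
key_order = sorted(titles, key=sortkey)

print(binary_order)  # ['Emil', 'Zebra', 'apple', 'Émile']
print(key_order)     # ['apple', 'Emil', 'Émile', 'Zebra']
```

A real implementation would generate keys with a proper collation library and per-language rules; the sketch only shows why storing a key instead of the raw characters sidesteps the database's lack of Unicode support.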

Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
