[Wikitech-l] Re: substr vs mb_substr

10 Jun 2005


Brion Vibber wrote:
> ...
> Ævar Arnfjörð Bjarmason wrote:
>> ...
>> I haven't actually looked at the code, but it should be using the
>> truncate() function from the language class. However, the Language.php
>> version of that function is not Unicode-aware, so stuff like this will
>> continue happening until bug 2069 is solved
>> (http://bugzilla.wikimedia.org/show_bug.cgi?id=2069).
>
> I don't understand this claim. The LanguageUtf8 truncate *is* already
> UTF-8 aware; 2069 is a code layout issue only and does not affect
> functionality.
>
> If there's a bug here, it's from failing to call the function in the
> first place and letting the database crop the field.

Right. To put it another way, LanguageUtf8 is the base class for every
language class except LanguageLatin1. $wgLang->truncate() will always
use the correct encoding for the wiki; it's only if you call it as
Language::truncate() that you'll run into trouble.

Note that you can use mb_substr() in MediaWiki if you like; I
implemented a simulation of it for systems without mbstring, using the
/./u trick.
-- Tim Starling
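
For readers unfamiliar with the /./u trick mentioned above, here is a
minimal sketch of the idea. It is not MediaWiki's actual fallback code:
the function name sketch_utf8_substr() is hypothetical, and the sketch
assumes PHP 7.1 or later, PCRE built with UTF-8 support, and a source
file saved as UTF-8. With the /u modifier, "." matches one whole code
point instead of one byte, so the string can be split into characters
and then sliced.

```php
<?php
// Hypothetical helper, for illustration only.
function sketch_utf8_substr(string $str, int $start, ?int $length = null): string
{
    // /u makes "." match a full UTF-8 code point; /s lets it match newlines.
    // preg_match_all() returns false if $str is not valid UTF-8.
    if (preg_match_all('/./us', $str, $matches) === false) {
        return '';
    }
    // array_slice() uses the same offset/length conventions as substr(),
    // including negative offsets, but counts characters instead of bytes.
    return implode('', array_slice($matches[0], $start, $length));
}

$name = "Ævar Arnfjörð";

// Byte-based: "Æ" is two bytes in UTF-8, so this cuts it in half.
var_dump(substr($name, 0, 1) === "\xC3");                // bool(true)

// Character-based: four characters, whatever their byte length.
var_dump(sketch_utf8_substr($name, 0, 4) === "Ævar");    // bool(true)
```

Splitting the whole string is simple but does more work than needed for
long inputs; it is meant only to show why counting matches of /./u
differs from counting bytes.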
