Re: [Wikitech-l] substr vs mb_substr

10 Jun 2005


On 6/9/05, Александр Сигачёв <alexander.sigachov@gmail.com> wrote:
...
Sal'
If the summary field text is in a multi-byte encoding and we take
the first 150 chars of the comment, a strange character sometimes
appears in the history.
Example: http://commons.wikimedia.org/w/index.php?title=Image:Venera-7_diagram.jpg&am...
(==Описание/Description== *ru:Межпланетная автоматическая станция
«Венера-7»: 1 — панели солнечных батарей; 2 — датчик астроориентации;
3 — защитная �)
==============
I think it's the first byte of a truncated two-byte character. So we
should use "mb_substr" instead of "substr", shouldn't we?
I haven't actually looked at the code, but it should be using the
truncate() function from the language class. However, the Language.php
version of that function is not Unicode-aware, so things like this will
keep happening until bug 2069 is solved
(http://bugzilla.wikimedia.org/show_bug.cgi?id=2069)
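
To illustrate the effect described above, here is a minimal sketch. It is written in Python rather than PHP, so it is not the MediaWiki code: slicing the encoded bytes stands in for a byte-oriented substr, and slicing the string itself stands in for a character-oriented call such as mb_substr. The function names and the short sample word (taken from the summary above) are only for illustration, and UTF-8 is assumed as the encoding.

```python
def truncate_bytes_naive(text: str, limit: int) -> bytes:
    """Cut the UTF-8 encoding at a byte offset, as a byte-oriented substr would."""
    return text.encode("utf-8")[:limit]


def truncate_utf8_safe(text: str, limit: int) -> str:
    """Keep at most `limit` bytes, dropping a partial character at the end."""
    raw = text.encode("utf-8")[:limit]
    # The input was valid UTF-8, so only the tail can be incomplete;
    # errors="ignore" discards that dangling lead byte.
    return raw.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    word = "Венера-7"  # six two-byte Cyrillic letters plus two ASCII characters

    cut = truncate_bytes_naive(word, 5)
    # The fifth byte is the first half of a two-byte letter, so decoding
    # the cut shows a replacement character at the end.
    print(cut.decode("utf-8", errors="replace"))

    # Byte limit respected, no broken character.
    print(truncate_utf8_safe(word, 5))

    # Counting characters instead of bytes (the mb_substr-style approach).
    print(word[:5])
```

Which of the two safe variants fits depends on whether the limit is meant in bytes (for example, a fixed-width database column) or in characters.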
