Re: [Wikitech-l] New diff feature for MediaWiki

8 Jun 2006


On 6/8/06, Roman Nosov rnosov@gmail.com wrote:
...
Well it looks like my question about why some quotation marks do break
words and others don't will remain unanswered ("rareness" of high
numbered punctuation doesn't make it part of a word) … Anyway if such
level of supporting UTF-8 is sufficient for Mediawiki then Unicode
issue is "solved". Unicode über alles.
I think it was adequately explained - the reason that character isn't detected
is that the algorithm doesn't know it's a separator character. So the text isn't
split there. If the algorithm did know, it would be separated properly.
So perhaps someone, like you, should submit a quick patch to that part of
the diff engine, as outlined by Tim, that makes it properly interpret that
code point. If there's a general rule or table in the Unicode standard then
implementing that might be an even better option.
The Unicode site, by the way, is www.unicode.org and you can find a database
of Unicode character properties here:
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
with information on interpreting them here:
http://ftp.lanet.lv/ftp/mirror/unicode/3.2-Update/UnicodeData-3.2.0.html
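To make the "general rule or table" idea concrete, here is a rough sketch in Python (not MediaWiki's actual diff code; the names is_separator and split_words are made up for illustration). It assumes that treating every code point whose general category is punctuation (P*), separator (Z*), or control (Cc) as a word break is acceptable, and it uses Python's bundled copy of the same character database:

```python
import unicodedata


def is_separator(ch):
    """Return True if ch should break words (assumed rule: P*, Z*, or Cc)."""
    category = unicodedata.category(ch)  # two-letter general category, e.g. 'Pi'
    return category[0] in ("P", "Z") or category == "Cc"


def split_words(text):
    """Split text into words, dropping every separator character."""
    words = []
    current = []
    for ch in text:
        if is_separator(ch):
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


# Curly quotes (U+201C, U+201D) are punctuation, so they break words
# the same way ASCII quotes do.
print(split_words("say \u201chello\u201d world"))
```

Note that this rule is deliberately blunt: an apostrophe is also punctuation, so "don't" would be split in two. A real patch would probably want a narrower table.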
Enjoy!
-- 
Ben Garney
Torque Technologies Director
GarageGames.Com, Inc.
