Re: [Wikimediaindia-l] (OT) On the importance of Unicode

22 Feb 2011


On 22 February 2011 22:29, Santhosh Thottingal
<santhosh.thottingal@gmail.com> wrote:
...
I think you have some confusion on Unicode and Fonts. Let me try to
clarify in simple words.
Yes - I did! And thank you for such a detailed response.
To see if I have understood this - there are three components:
1. Input (Different types of keyboard layouts are used but are
independent of the method of encoding - correct?)
2. Encoding and storing the input (ASCII is the older method; I have
heard of ISCII as well but do not know what it is. Unicode is the
standard.)
3. Representing, visually for the human user, what has been input
and encoded. (Fonts or typefaces; these are, to an extent,
independent of the encoding method used.)
...
But I know that many people still use terms like "data in Unicode
fonts", "data in xyz font", etc. This usage came into existence just
because, before Unicode was popular, most Indian publishers
used a non-standard way of representing our data: using English (or
Latin, i.e. ASCII) data and changing the font's 'face' to Indian glyphs,
a "fancy dress" hack. The letter "k" will be shown as Hindi "ka" with the
help of a font, i.e. the data is still English, but what you "see" is
Hindi.
So if I understand correctly, not only is the encoding in ASCII but
the representation of that encoding is tied to a particular font (that
was used for representation at entry?) and will only be represented
properly when using that font? However, what I am trying to understand
is whether there is consistency across the ASCII encoding? Will ka in
Hindi be encoded in ASCII only one way or is there a linkage, that I
do not understand, to the font used to represent it as well?
The reason I ask is because if ka in Hindi is always encoded the same
way irrespective of the font used to represent it, then it should not
be hard to build an ASCII to Unicode map of encoding that will only
have to be done once for each language? Though something tells me I am
way off on this assumption.
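
To make the question concrete, here is a minimal sketch in Python of
what such a converter might look like. The two font names and their key
assignments are hypothetical, invented purely for illustration. The
assumption being illustrated is that each legacy font defines its own
assignments, in which case a table is needed per font rather than per
language.

```python
# Hypothetical legacy fonts: each maps ASCII characters to Devanagari
# letters in its own way. These tables are invented for illustration only.
LEGACY_FONT_MAPS = {
    "FontA": {"k": "\u0915", "K": "\u0916", "g": "\u0917"},  # ka, kha, ga
    "FontB": {"d": "\u0915", "[": "\u0916", "x": "\u0917"},  # same letters, other keys
}


def legacy_to_unicode(text: str, font: str) -> str:
    """Convert font-encoded ASCII text to Unicode using that font's table.

    Characters missing from the table are passed through unchanged.
    """
    table = LEGACY_FONT_MAPS[font]
    return "".join(table.get(ch, ch) for ch in text)


# The same Hindi letters are stored as different ASCII strings per font.
print(legacy_to_unicode("kKg", "FontA"))
print(legacy_to_unicode("d[x", "FontB"))
```

Running text through the wrong table leaves it unconverted or converts
it to the wrong letters, so the font name has to be known along with
the data.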
...
This is true. Fonts exist for all scripts, but the variety and
quality of the existing fonts vary. Availability of fonts under
FOSS-compatible licenses is also a problem. For a detailed list of
Indic fonts with license info, see
http://indlinux.org/wiki/index.php/IndicFontsList
Thanks, Santhosh. This is really useful. Also, are these screen or
print-ready fonts?
...
You are correct.  I would say "fonts licensed under any FOSS license"
instead of "free use/reuse".
Indeed. FOSS license is what I should have said.
...
In fact, the funds were spent (read: wasted) on the development of
proprietary fonts by government agencies like CDAC. Fonts with
free(dom) licenses were developed and maintained by FOSS developer
communities.
*sigh* In your opinion, would there be any real benefit if they did
license the ILDC series under a true FOSS license?
...
Each Indic character in Unicode is stored as multiple bytes, while an
ASCII character is a single byte.
Ah. Okay. I understand now.
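
One quick way to check this is to encode a character and count the
bytes. The sketch below assumes UTF-8 and UTF-16 as the Unicode
encodings (the thread does not name one). In UTF-8 an ASCII letter
still takes one byte, while a Devanagari letter takes three.

```python
# Compare stored sizes of a Latin letter and a Devanagari letter.
latin_k = "k"        # ASCII letter
hindi_ka = "\u0915"  # DEVANAGARI LETTER KA

ascii_size = len(latin_k.encode("ascii"))       # 1 byte
utf8_latin = len(latin_k.encode("utf-8"))       # 1 byte
utf8_hindi = len(hindi_ka.encode("utf-8"))      # 3 bytes
utf16_hindi = len(hindi_ka.encode("utf-16-le")) # 2 bytes

print(ascii_size, utf8_latin, utf8_hindi, utf16_hindi)
```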
...
This is not comparable, since search is not possible with the ASCII-font
way of representing data. Since the data is not in Hindi, but we just
"see" it as Hindi, one cannot do a search or any such data processing on
that data.
If I understand, it is not possible to search within ASCII encoded
text but this can be done in Unicode encoded text?
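
A small sketch of the difference, in Python. The font-encoded string
"kml" is a made-up example of what might be stored under the font hack.
It does not come from any real font.

```python
hindi_ka = "\u0915"  # DEVANAGARI LETTER KA

# Font-hack data: the stored characters are Latin letters; only the font
# makes them look like Hindi on screen. Made-up example string.
font_encoded = "kml"

# Unicode data: the stored characters are the Devanagari letters themselves.
unicode_text = "\u0915\u092e\u0932"  # ka, ma, la

found_in_font_data = hindi_ka in font_encoded  # False
found_in_unicode = hindi_ka in unicode_text    # True

print(found_in_font_data, found_in_unicode)
```

Searching the font-encoded data for "k" would find a match, but the same
search would also match ordinary English text, since nothing in the data
says it is Hindi.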
Thank you very much, Santhosh - I have learned a lot from this.
Best,
Gautam
