[Wikimediaindia-l] (OT) On the importance of Unicode

Santhosh Thottingal santhosh.thottingal at gmail.com
Tue Feb 22 16:59:57 UTC 2011


On Thu, Feb 17, 2011 at 11:29 AM, Gautam John <gautam at prathambooks.org> wrote:
> 2. Given that we publish in Indian languages, using Unicode fonts are
> the only way to achieve cross-platform interoperability and is a
> global standard.
> 3. Given India's push towards copyright reform for the print impaired,
> it is imperative that Unicode fonts be used in the creation of Indic
> content because it is otherwise a huge barrier to conversion to
> print-friendly formats.
> 4. Unicode, being an open global standard guarantees content
> accessibility in the future and ensures no proprietary font and vendor
> lock in.

I think there is some confusion here between Unicode and fonts. Let me
try to clarify in simple words.
Unicode is an encoding standard. It assigns each 'letter' a unique
number (a code point) and says how that number is represented by a
group of bits or bytes. This ensures that each letter is unique across
the thousands of languages in the world.
Fonts are just "clothes" for this data: sometimes optimized for the
web, sometimes for print, sometimes fancy... Data can exist without
fonts too. The only catch is that one cannot "see" the data properly,
or one sees it naked (as question marks, squares or raw code points,
depending on your operating system environment).
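
To make the split between encoding and font concrete, here is a small
Python sketch (my own illustration, standard library only). It assumes
the Devanagari letter "ka" as the example and shows that the letter is
just a number with a standard name, and that UTF-8 is one way of
turning that number into bytes. No font appears anywhere in it.

```python
import unicodedata

# The Devanagari letter "ka", written as an escape so that this
# snippet does not depend on any font being installed.
ka = "\u0915"

code_point = ord(ka)             # the unique number Unicode assigns
name = unicodedata.name(ka)      # the standard name of that letter
utf8_bytes = ka.encode("utf-8")  # one byte representation of it

print(hex(code_point))    # 0x915
print(name)               # DEVANAGARI LETTER KA
print(utf8_bytes.hex())   # e0a495
```

Whether a screen then shows a proper glyph or an empty square depends
only on the fonts available on the reader's machine; the data above is
the same either way.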

So saying "using Unicode fonts for Indic content" does not make
sense: we cannot represent or "store" data in fonts. And saying
"Unicode fonts are the only way to achieve interoperability" is
wrong, since it is the encoding standard that makes interoperability
possible.

Unicode data has no dependency on the font. The font is the user's
choice, and it is made on the reader's side.

But I know that many people still use terms like "data in Unicode
fonts" or "data in xyz font". This usage came into existence because,
before Unicode became popular, most Indian publishers used a
non-standard way of representing our data: they stored English (that
is, Latin/ASCII) data and changed the font's 'face' to Indian glyphs,
a "fancy dress" hack. The letter "k" would be shown as Hindi "ka" with
the help of a font, i.e. the data is still English, but what you "see"
is Hindi.
Obviously such data cannot be presented to anybody without these
"special clothes". If you get this data and don't have the associated
font, all you see is junk Latin characters. Many publishers created
their own fonts with this technique, each in their own way. So to send
some data to a friend, you need to tell them: hey, this data is in
Sree font, this data is in Kathika font, etc.
Even now that Unicode is popular, only a very small percentage of
publishers have moved to Unicode; the others continue with
ASCII-font-dependent data.
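
A rough Python sketch of what this hack means for the data. The
mapping table and the function name are hypothetical: I made them up
for one imaginary legacy font, following the "k" shown as "ka" example
above. Real legacy fonts each used their own mapping, and real
converters also have to reorder vowel signs and handle conjuncts,
which this sketch ignores.

```python
# Hypothetical glyph table for ONE imaginary legacy font.
# Every legacy font needed its own table like this.
LEGACY_FONT_MAP = {
    "k": "\u0915",  # shown as Devanagari "ka" by the font
    "g": "\u0917",  # shown as Devanagari "ga" by the font
}


def legacy_to_unicode(text, font_map):
    """Replace each legacy character with its Unicode letter.

    Characters missing from the table are passed through unchanged.
    """
    return "".join(font_map.get(ch, ch) for ch in text)


legacy_data = "k"  # what is actually stored: plain ASCII
stored_bytes = legacy_data.encode("ascii")
converted = legacy_to_unicode(legacy_data, LEGACY_FONT_MAP)

print(stored_bytes)          # b'k'
print(hex(ord(converted)))   # 0x915
```

Without the right table (that is, without knowing which font the data
was typed for), there is no way to tell from the bytes alone that "k"
was meant to be a Hindi letter.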

If one uses Unicode, there is no need to mention the font. One can
read the data using a good "Unicode compatible" font of his/her choice.
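
As an illustration, the following Python sketch shows that Unicode data
identifies its own script, so the receiver needs no hint about fonts.
The helper scripts_in is a name I made up, and taking the first word of
a character's Unicode name as its script is only a heuristic that
happens to work for these examples; it is not an official script
lookup.

```python
import unicodedata


def scripts_in(text):
    """Return the first word of the Unicode name of every letter or
    combining mark in text (a rough stand-in for the script)."""
    found = set()
    for ch in text:
        # Only look at letters (L*) and combining marks (M*).
        if unicodedata.category(ch)[0] in ("L", "M"):
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


unicode_text = "\u0915\u092e\u0932"  # Devanagari letters ka, ma, la
legacy_text = "kml"                  # the kind of ASCII a legacy font would dress up

print(scripts_in(unicode_text))  # {'DEVANAGARI'}
print(scripts_in(legacy_text))   # {'LATIN'}
```

The second result is the point: legacy data looks like Latin text to
every program, because that is what it is.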

So "data is in Unicode encoding" is correct. "Data is in Unicode font"
is wrong. "Data can be viewed using any Unicode compatible font" is
correct.
I hope it is clear.

> 5. The limitation is on the lack of high quality and varied typefaces
> that are both screen and print optimised open type Indic Unicode
> fonts.

This is true. Fonts exist for all scripts, but the variety and
quality of the existing fonts vary. The availability of fonts under
FOSS-compatible licenses is also a problem. For a detailed list of
Indic fonts with license info, see
http://indlinux.org/wiki/index.php/IndicFontsList


> 6. Given the importance of linguistic diversity to India's cultural
> heritage, it is imperative that greater attention is paid to the
> development of such fonts under licenses that allow for free re-use
> and to fix issues in the fonts that might arise.

You are correct.  I would say "fonts licensed under any FOSS license"
instead of "free use/reuse".

> 7. The Govt. should fund the open development of at least 5 such fonts
> for each the 21 Constitutionally recognised languages and make these
> available not just for free, but under free license to re-use and
> improve as well.

You got it. But history shows that such funding did not play much of
a role in the development of the fonts listed here:
http://indlinux.org/wiki/index.php/IndicFontsList
In fact, the funds were spent (read: wasted) on the development of
proprietary fonts by government agencies like CDAC. Fonts with
free(dom) licenses were developed and maintained by FOSS developer
communities.


Thanks
Santhosh Thottingal
http://thottingal.in


