Re: [Wikimediaindia-l] (OT) On the importance of Unicode

23 Feb 2011


Dear Anivar:
...
There are Four Components
Thanks for the addendum - how important is the rendering engine in the
scheme of things? Is work on that pretty much done or are there issues
there too?
...
It is font dependent. Conversion maps need to be prepared for each
ASCII font in order to convert data encoded in it to Unicode.
Swathanthra Malayalam Computing's Payyans
(http://wiki.smc.org.in/Payyans) is a tool developed for converting
ASCII to Unicode easily for any Indic language by building a font map
for each needed font. This tool helped Malayalam Wiktionary to
convert many copyright-expired books in non-standard encodings to
Unicode.
A popular Firefox extension named Padma uses similar encoding conversion
tables to display ASCII news websites in Unicode.
So how do these work? They have built a map for every single ASCII
encoding/font pair (since this is some ugly hack) and the
corresponding Unicode value? There must be thousands of ASCII
encoding/font pairs right? Is this even a viable option? Are there
alternatives to this?
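
As a rough illustration of the font-map approach described above, here
is a minimal Python sketch. The FONT_MAP table and its keys are invented
for illustration and do not correspond to Payyans, Padma, or any real
font. The reordering step assumes the legacy text stores left-side vowel
signs before the consonant (visual order), whereas Unicode stores them
after it.

```python
# Hypothetical map for one imaginary legacy font. The keys are made up;
# a real map has to be built by hand for each font.
FONT_MAP = {
    "I": "\u0D15",              # MALAYALAM LETTER KA
    "a": "\u0D2E",              # MALAYALAM LETTER MA
    "e": "\u0D32",              # MALAYALAM LETTER LA
    "m": "\u0D3E",              # MALAYALAM VOWEL SIGN AA
    "s": "\u0D46",              # MALAYALAM VOWEL SIGN E (drawn left of the consonant)
    "\u00A1": "\u0D15\u0D4D\u0D15",  # one conjunct glyph -> KA + VIRAMA + KA
}

# Vowel signs that are drawn before the consonant but follow it in Unicode.
PREBASE_SIGNS = {"\u0D46", "\u0D47", "\u0D48"}


def convert(text, font_map=FONT_MAP):
    """Convert font-encoded text to Unicode using a per-font map."""
    # Try longer keys first so multi-character keys win over their prefixes.
    keys = sorted(font_map, key=len, reverse=True)
    out = []
    pending = None  # pre-base sign waiting for the unit it belongs to
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                unit = font_map[key]
                i += len(key)
                break
        else:
            unit = text[i]  # unmapped characters pass through unchanged
            i += 1
        if unit in PREBASE_SIGNS:
            if pending is not None:
                out.append(pending)
            pending = unit
            continue
        out.append(unit)
        if pending is not None:
            out.append(pending)  # move the sign after its consonant
            pending = None
    if pending is not None:
        out.append(pending)
    return "".join(out)


print(convert("sI"))  # the sign typed first ends up after the consonant
```

The cost raised in the question is visible here: the conversion logic is
small, but the table has to be prepared separately for every font.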
...
I don't think this will happen. There is a long history of lobbying
CDAC for this from 2001 onwards, and nothing happened. CDAC made enough
money by selling ASCII fonts (and still does), and they can't even think
about giving them away under a FOSS license. And at frequent intervals
they consume more government money for making yet another CD to ship
their FOSS project forks (such as Bhaathiya OO, IndiFox etc.) along with
these fonts. In the same way, most of the TDIL funding to CDAC for Indic
language technology research either produces no output at all or the
output is not released, even after TDIL's policy decision to release it
under a FOSS license.
I can see the frustration of this - so in your opinion, an effort not
worth undertaking? Assuming they were ready to use a FOSS license, are
the fonts good enough to want to use?
...
Searching and sorting algorithms for Indic languages are in
development and are not bug free. Indic support is not yet available
in most of the search solutions (including FOSS solutions like Lucene
or Solr) because of the complex word formation characteristics.
But if I understand correctly, this is *only* possible using Unicode
encoding. Right?
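
Even with Unicode, the same text can be stored as different code point
sequences, which is one reason searching needs care. A minimal Python
sketch, using the standard unicodedata module; the helper name contains
is made up for illustration, and the Devanagari word is only an example:

```python
import unicodedata


def contains(haystack, needle):
    """Substring search after bringing both strings to the same normal form."""
    normalize = lambda s: unicodedata.normalize("NFC", s)
    return normalize(needle) in normalize(haystack)


# The same word written two ways: with the single code point U+0958 (QA),
# and with U+0915 (KA) followed by U+093C (NUKTA).
precomposed = "\u0958\u093F\u0932\u093E"
decomposed = "\u0915\u093C\u093F\u0932\u093E"

print(precomposed == decomposed)           # raw comparison of code points
print(contains(precomposed, decomposed))   # comparison after normalization
```

This only covers canonical equivalence; language-aware sorting and word
matching need more than normalization.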
Thank you, Anivar.
Best,
Gautam
________
http://social.prathambooks.org/
