[Wikimediaindia-l] Tamil Encyclopedia merge into Wikipedia.
Shiju Alex
shijualexonline at gmail.com
Mon Nov 15 06:32:42 UTC 2010
I have a query.
What is the license of Tamil Kalaikalanjiam? Has the Tamil Nadu government or
Tamil Virtual University officially announced that this encyclopedia is
released into the public domain or under a Creative Commons license, so that
we can reuse the content? If yes, we can very well reuse it. Otherwise it
will be a copyright violation. So kindly verify this.
Let us not assume that because it is published by the Government it will be
in the public domain. In India that is not the case.
In December 2008, the Kerala Government officially announced that it was
changing the license of a similar encyclopedia project in Malayalam
(Sarvavijnanakosam) to the GNU Free Documentation License
<http://www.gnu.org/copyleft/fdl.html> so that the Malayalam wiki community
can reuse its content to develop Malayalam Wikipedia. The Kerala Government
has also set up its own wiki (to help us) for Sarvavijnanakosam
<http://en.wikipedia.org/wiki/Sarvavijnanakosam>, and they are slowly
digitizing the content and posting it on that wiki (http://mal.sarva.gov.in).
They have completed some 2,900 articles so far. We are reusing this content
to enhance many of the existing articles. But we are not copy-pasting the
entire content, for various reasons. The main reason is that the content
needs to be rewritten in the style of Wikipedia.
I really have doubts about the effectiveness of current OCR software for
Indian languages. It is still under development, and the existing solutions
are not good. I am not sure about Tamil OCR software in particular.
Shiju Alex
On Mon, Nov 15, 2010 at 11:33 AM, Murali Kumar <pthooran at hotmail.com> wrote:
> Dear Wikimedia India,
>
> As you are probably aware, the Govt. of India, immediately post Independence,
> started multiple Indian language encyclopedia projects to stream in Science
> and Technology. The Tamil language encyclopedia was completed [
> http://en.wikipedia.org/wiki/Tamil_Encyclopedia]
>
> I'm pleased to report Tamil Virtual University has scanned in the Tamil
> Kalaikalanjiam / Tamil Encyclopedia [Please see Reference 1 below].
>
> I was able to download the material via the wonderful wget command and the
> 'convert' (from imagemagick lib) in GNU/Linux. However each of the 10
> volumes is close to 700 MB without compression.
>
> I would imagine, the people behind this mammoth task (pre-internet era)
> would have liked it to be merged into a Wiki type format, which would make
> it a truly living document in-sync with the times.
>
> I do not have any experience with 1) Tamil OCR software and 2) Automated
> updates to Wikipedia.
>
> Can anyone take the lead on this project? It will help boost the number of
> quality articles in Indian languages. The Children's encyclopedia is being
> scanned and has a lot of great visual content.
>
> I have uploaded a sample (10 MB) PDF file at
> https://sites.google.com/site/periasamythooran/kalaikalanjiam/kalaikalanjiamWikiMergeAttempt.pdf
> if you are interested in giving it a spin.
>
> Thanks,
>
> Murali.
>
> 1. http://www.tamilvu.org/library/libindex.htm and click on Kalaikalanjiam
> / Tamil Encyclopedia.