Hi All,
I have tested it on Bengali images and it works fine, with 100% accuracy. Can it be implemented in the Proofread extension somehow?
Regards, Jayanta
---------- Forwarded message ----------
From: Subhashish Panigrahi subhashish@cis-india.org
Date: Sat, Aug 29, 2015 at 3:22 PM
Subject: [Wikimediaindia-l] Google's Optical Character Recognition software now works with all South Asian languages
To: wikimediaindia-l@lists.wikimedia.org
Google's OCR, apparently the most accurate OCR we have seen so far, works really well for all the major South Asian scripts:
http://globalvoicesonline.org/2015/08/29/googles-optical-character-recognition-software-now-works-with-all-south-asian-languages
Here are test cases for many Indian scripts: https://goo.gl/3X75iR
Except for Gurmukhi, most scripts work really well.
This could be really useful for Indian-language Wikimedians and will come in handy for digitizing printed and scanned text. Here is an animated tutorial for Wikimedians on using this tool for Wikisource/Wikipedia:
https://commons.wikimedia.org/wiki/File:Tutorial_to_use_Google_Optical_Character_Recognition.gif
Please write to me if you want to localize this tutorial in your language.
--
Best!
Subhashish Panigrahi
Programme Officer, Access To Knowledge
Centre for Internet and Society
@subhapa / https://cis-india.org
Thanks for forwarding.
Jayanta Nath, 30/08/2015 00:50:
Can it be implemented in the Proofread extension somehow?
With manual copy and paste only, as far as I can see. I don't think Google Drive has any API to extract text from uploaded files, though someone should check https://developers.google.com/google-apps/realtime/drive more carefully than I just did.
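For whoever does that check, here is a minimal sketch of what an API route might look like. It assumes that the Drive REST API (v3) runs OCR when an image is uploaded with a Google Docs target type, that it accepts an `ocrLanguage` hint, and that the resulting document can be exported as plain text. All of this should be verified against the current Drive documentation and terms of use. The function names, the `image/png` default, and the token are hypothetical, and the code only builds the requests without sending anything.

```python
import json
import urllib.parse
import urllib.request
import uuid

# Assumed Drive v3 endpoints; verify against the current documentation.
UPLOAD_URL = "https://www.googleapis.com/upload/drive/v3/files"
FILES_URL = "https://www.googleapis.com/drive/v3/files"
GOOGLE_DOC = "application/vnd.google-apps.document"


def build_ocr_upload(image, name, lang, token, image_type="image/png"):
    """Build (but do not send) a multipart upload asking Drive to convert
    an image into a Google Doc, the step assumed to trigger OCR."""
    boundary = uuid.uuid4().hex
    metadata = {"name": name, "mimeType": GOOGLE_DOC}
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b"Content-Type: application/json; charset=UTF-8\r\n\r\n",
        json.dumps(metadata).encode(),
        f"\r\n--{boundary}\r\n".encode(),
        f"Content-Type: {image_type}\r\n\r\n".encode(),
        image,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    # ocrLanguage is a hint such as "bn" for Bengali (assumed parameter).
    query = urllib.parse.urlencode(
        {"uploadType": "multipart", "ocrLanguage": lang}
    )
    return urllib.request.Request(
        f"{UPLOAD_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/related; boundary={boundary}",
        },
    )


def build_text_export(file_id, token):
    """Build (but do not send) a request exporting the converted
    document as plain text."""
    query = urllib.parse.urlencode({"mimeType": "text/plain"})
    return urllib.request.Request(
        f"{FILES_URL}/{urllib.parse.quote(file_id, safe='')}/export?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Sending the requests would need an OAuth token from a Google account, so any integration would still depend on the terms of use.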
Note that Google terms of use are very restrictive.
Nemo
Federico Leva (Nemo), 30/08/2015 09:38:
Note that Google terms of use are very restrictive.
It seems the Google Drive API doesn't have specific terms of use, only the generic ones at https://developers.google.com/terms/
Relevant passages may be:
5e. Prohibitions on Content
Unless expressly permitted by the content owner or by applicable law, you will not, and will not permit your end users or others acting on your behalf to, do the following with content returned from the APIs:
Scrape, build databases, or otherwise create permanent copies of such content, or keep cached copies longer than permitted by the cache header;
[...]
8b. Your Obligations Post-Termination
Upon any termination of the Terms or discontinuation of your access to an API, you will immediately stop using the API, cease all use of the Google Brand Features, and delete any cached or stored content that was permitted by the cache header under Section 5.
wikisource-l@lists.wikimedia.org