I tried ABBYY before and the quality was low.
I will try Tesseract and see what happens.
Best
On Tue, Jun 24, 2014 at 7:08 PM, Aleksey Chalabyan <xelgen.am(a)gmail.com>
wrote:
ABBYY FineReader has supported Hebrew and Arabic since v. 11. But I'm afraid
sharing a script is not enough. For example, FineReader has three versions for
Armenian. All three use the same script, with different orthography and
slightly different vocabulary, but if you set the wrong language the drop in
quality is dramatic. So I'm not sure whether Arabic OCR would work well for
text in Farsi (Persian).
FineReader provides a 30-day full trial, and I think it's worth giving it
a try.
You may try approaching ABBYY to check whether there are any plans for full
support of Persian in the near future.
Training Tesseract also seems like a good idea for getting a free/open source
OCR for Persian, if you can get enough resources for that. But I can't
comment on how well it will work with RTL scripts, especially with
Nastaliq/Naskh, where letters and words are not separated from each other.
On Tue, Jun 24, 2014 at 6:13 PM, Federico Leva (Nemo) <nemowiki(a)gmail.com>
wrote:
Amir Ladsgroup, 24/06/2014 15:37:
I have access to huge resources of old books in Persian (some of them
are even typed) and almost all of them can be imported to Wikisource, but
the problem is I don't have (or know of) any OCR for Persian. Do you know
which OCR software supports Persian texts (supporting Arabic is not enough;
I checked several programs)?
The only result for "Persian" and OCR on the ABBYY website is
<http://www.abbyy.com/CaseStudies/SISU-Reveals-Its-Multilingual-Content-to-Academic-Community-Thanks-to-ABBYY-Recognition-Server/>,
weird! It's worth asking them for some details; they might have some
additional plugins.
On the FLOSS side, maybe some library in Iran has made some investments in
Tesseract? If there's any big digital library of Persian content, you should
ask them as well.
Reminder: archive.org is still in need of people willing to compare 8.0
vs. 9.0 OCR results of some books in their language. :)
http://thread.gmane.org/gmane.org.wikimedia.wikisource/1552
Nemo
_______________________________________________
Wikisource-l mailing list
Wikisource-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l