Today I brushed up on ScanTailor and unpaper in order to upload a rare book from WWI: https://archive.org/details/lettere-ferrovieri-patria
I noticed the OCR is now "ABBYY FineReader 11". IA only recently switched to 9 after many years on 8, so this seems like good news.
If I read it correctly, 11 claims:
New and improved language support
- New OCR languages: Turkmen (Latin) and Old Slavonic
- New ICR languages: Danish, Norwegian (Bokmal & Nynorsk), Old English, Serbian (Cyrillic), Tajik
- Latin language has full dictionary support
- Improved OCR for Chinese, Japanese & Korean
10 claims:
Improved Language Support
- Chinese
- Korean
- Japanese
It seems IA technology is evolving rather fast:
https://blog.archive.org/2015/10/21/grant-to-develop-the-next-generation-way...
https://blog.archive.org/2015/10/23/zoom-in-to-9-3-million-internet-archive-...
Nemo
wikisource-l@lists.wikimedia.org