Federico Leva (Nemo), 11/10/2013 08:48:
Dispenser kindly made a list of DjVu files on Commons that link to an IA item, with some information such as global usage: https://toolserver.org/~dispenser/temp/djvu2archive.org.txt (just change the extension to csv to open it as a spreadsheet; it's tab-separated). It has about 5000 books with 6-200 global usages and 5000 outside that range (which probably means they're either completely unused apart from some talk pages or whatever, or have most of their text already living on wiki pages). If I manage to convince a "slash-admin", I'll get those 5000 re-OCR'd; otherwise I need to do it manually, so suggestions on priorities are welcome. :)
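For anyone who prefers a script to a spreadsheet, here is a minimal sketch of how the 6-200 range could be pulled out of a tab-separated list like this one. It assumes the file has a header row; the column names (file, ia_item, global_usage) and the sample rows are made up for illustration and are not taken from Dispenser's actual file, so adjust them to whatever the real header says.

```python
import csv
import io


def select_by_usage(tsv_text, usage_column, low=6, high=200):
    """Return rows whose usage count lies in [low, high], most used first.

    usage_column is the header name of the global-usage column; the real
    name in the list is not known here, so it is passed in by the caller.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    selected = []
    for row in reader:
        try:
            usage = int(row[usage_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric usage value
        if low <= usage <= high:
            selected.append(row)
    selected.sort(key=lambda r: int(r[usage_column]), reverse=True)
    return selected


# Hypothetical sample data, only to show the shape of the input.
sample = (
    "file\tia_item\tglobal_usage\n"
    "A.djvu\titem_a\t3\n"
    "B.djvu\titem_b\t150\n"
    "C.djvu\titem_c\t42\n"
    "D.djvu\titem_d\t900\n"
)
picked = select_by_usage(sample, "global_usage")
print([r["file"] for r in picked])
```

Sorting by usage is just one possible way to set priorities; the bounds can be changed through the low and high arguments.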
Jeff at the Internet Archive tells me they haven't tested the new OCR extensively yet, so they won't re-OCR en masse for now. I'll select a few test cases, reupload them to different items and see what difference the new OCR makes; I could use some help comparing the results for non-Romance languages, though... I'll also try some books in the newly supported languages: Hebrew and Thai (now with dictionary), Chinese (traditional and simplified) and Japanese.
Nemo