[Commons-l] [cultural-partners] [Wikisource-l] ABBYY Finereader 11 on Toolserver: do we like it?

Tue Nov 29 15:57:42 UTC 2011

On 11/28/2011 10:23 PM, Alex Brollo wrote:
> [...] FineReader 11 [...] produces a complete djvu file [...] Text 
> layer hasn't full range of details, it's organized into two levels 
> (page and line), while OCR engine on IA servers produces a very rich 
> "tree" (page, column, region, paragraph, line and word).

Has anybody designed a web interface that shows the scanned
image together with the zones or regions of the DjVu text layer?
It would look similar to image annotation on Commons:
http://commons.wikimedia.org/wiki/Commons:Image_annotations

For a DjVu file uploaded to Commons, could you automatically
generate image annotations for the various text columns and
illustrations? Does image annotation handle multi-page
document formats such as PDF and DjVu?
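
To make the idea concrete, here is a minimal sketch in Python of how
zone rectangles might be pulled out of a text layer. It assumes the
layer has been dumped as an S-expression, as DjVuLibre's
"djvused file.djvu -e 'select 1; print-txt'" does, with each zone
written as (type xmin ymin xmax ymax ...) and the origin at the
bottom-left corner of the page. The function names, the SAMPLE data
and the (type, left, top, width, height) output shape are my own
assumptions for illustration, not what Commons image annotation
actually stores.

```python
import re

# Zone types used in the DjVu hidden-text layer, coarsest to finest.
ZONE_TYPES = {"page", "column", "region", "para", "line", "word", "char"}

# Tokens: parentheses, double-quoted strings, or bare atoms (names, numbers).
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def parse_zones(sexpr):
    """Parse an S-expression text dump into nested dicts (hypothetical helper)."""
    tokens = TOKEN.findall(sexpr)
    pos = 0

    def read_zone():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        kind = tokens[pos + 1]
        if kind not in ZONE_TYPES:
            raise ValueError("unknown zone type: %s" % kind)
        bbox = tuple(int(t) for t in tokens[pos + 2:pos + 6])
        pos += 6
        zone = {"type": kind, "bbox": bbox, "children": [], "text": None}
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                zone["children"].append(read_zone())
            else:
                # A quoted string: strip the quotes, leave escapes untouched.
                zone["text"] = tokens[pos][1:-1]
                pos += 1
        pos += 1  # consume the closing ")"
        return zone

    return read_zone()


def rectangles(zone, page_height, wanted=("column", "region", "para")):
    """Yield (type, left, top, width, height) measured from the top-left."""
    xmin, ymin, xmax, ymax = zone["bbox"]
    if zone["type"] in wanted:
        # Flip the y axis: the dump counts upward from the bottom edge.
        yield (zone["type"], xmin, page_height - ymax, xmax - xmin, ymax - ymin)
    for child in zone["children"]:
        yield from rectangles(child, page_height, wanted)


# Made-up two-column page, not taken from a real scan.
SAMPLE = '''
(page 0 0 1000 1500
 (column 100 200 480 1400
  (line 100 1350 480 1400
   (word 100 1350 220 1400 "Hello")
   (word 240 1350 400 1400 "world")))
 (column 520 200 900 1400
  (line 520 1350 900 1400 "second column")))
'''

page = parse_zones(SAMPLE)
page_height = page["bbox"][3]
rects = list(rectangles(page, page_height, wanted=("column",)))
for rect in rects:
    print(rect)
```

A text layer that only has page and line zones, like the FineReader 11
output described above, would give no column rectangles at all, so the
richer tree from the IA servers is what would make this useful.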

(Shouldn't image annotations and timed text be the same thing?)

-- 
   Lars Aronsson (lars at aronsson.se)
   Aronsson Datateknik - http://aronsson.se