We are testing a trick that is useful for Internet Archive (IA) items that have no djvu file but do have a _djvu.xml file.
The _djvu.xml file is split into pages, and each page is uploaded "as is" as the page text. A jQuery script can then parse the XML and convert it into excellent plain text. The same trick works in both djvu-based and pdf-based Index pages. Another advantage is that the mapped text is saved as the first version of the page content, so it can be recovered and used without any external tool.
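To make the splitting and conversion step concrete, here is a minimal sketch in Python (not the jQuery script itself, which runs in the browser). It assumes the usual DjVuXML layout, with one OBJECT element per page containing PARAGRAPH, LINE and WORD elements; the sample data and the function names page_to_text and split_pages are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny hand-made sample; a real _djvu.xml file is much larger.
SAMPLE = """<DjVuXML><BODY>
<OBJECT data="page1.djvu"><HIDDENTEXT><PAGECOLUMN><REGION>
<PARAGRAPH>
<LINE><WORD coords="10,40,60,20">Hello</WORD><WORD coords="70,40,120,20">world</WORD></LINE>
<LINE><WORD coords="10,70,60,50">second</WORD><WORD coords="70,70,120,50">line</WORD></LINE>
</PARAGRAPH>
<PARAGRAPH>
<LINE><WORD coords="10,120,60,100">New</WORD><WORD coords="70,120,150,100">paragraph</WORD></LINE>
</PARAGRAPH>
</REGION></PAGECOLUMN></HIDDENTEXT></OBJECT>
<OBJECT data="page2.djvu"><HIDDENTEXT><PAGECOLUMN><REGION>
<PARAGRAPH>
<LINE><WORD coords="10,40,60,20">Page</WORD><WORD coords="70,40,120,20">two</WORD></LINE>
</PARAGRAPH>
</REGION></PAGECOLUMN></HIDDENTEXT></OBJECT>
</BODY></DjVuXML>"""


def page_to_text(page):
    """Join words into lines and lines into paragraphs for one page."""
    paragraphs = []
    for para in page.iter("PARAGRAPH"):
        lines = [
            " ".join((w.text or "").strip() for w in line.iter("WORD"))
            for line in para.iter("LINE")
        ]
        paragraphs.append("\n".join(lines))
    # A blank line separates paragraphs.
    return "\n\n".join(paragraphs)


def split_pages(xml_text):
    """Return one plain-text string per OBJECT (page) element."""
    root = ET.fromstring(xml_text)
    return [page_to_text(obj) for obj in root.iter("OBJECT")]


pages = split_pages(SAMPLE)
print(pages[0])
```

In the workflow described above, the raw XML of each page would be uploaded unchanged and only converted afterwards; the sketch just shows that the conversion needs nothing beyond the XML itself.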
While parsing the XML, the same script can also use the word coordinates to fix some severe FineReader mistakes caused by wrong analysis of the text layout (wrong splitting of the text into columns/regions).
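One possible way to use the coordinates, sketched in Python under stated assumptions: ignore the REGION and LINE grouping produced by the OCR, group the words into rows by their vertical position, and sort each row from left to right. This assumes that the coords attribute starts with "left,bottom,right,top" and that the page really has a single column. The function name reflow and the tolerance value are hypothetical, and the actual script may work differently.

```python
import xml.etree.ElementTree as ET

# Hand-made page where the layout analysis cut two text lines into
# a left region and a right region, so the reading order is wrong.
SAMPLE_PAGE = """<OBJECT><HIDDENTEXT><PAGECOLUMN>
<REGION><PARAGRAPH>
<LINE><WORD coords="10,40,50,20">The</WORD><WORD coords="60,40,110,20">quick</WORD></LINE>
<LINE><WORD coords="10,70,60,50">jumps</WORD><WORD coords="70,70,110,50">over</WORD></LINE>
</PARAGRAPH></REGION>
<REGION><PARAGRAPH>
<LINE><WORD coords="200,41,250,21">brown</WORD><WORD coords="260,41,300,21">fox</WORD></LINE>
<LINE><WORD coords="200,72,240,52">the</WORD><WORD coords="250,72,290,52">dog</WORD></LINE>
</PARAGRAPH></REGION>
</PAGECOLUMN></HIDDENTEXT></OBJECT>"""


def reflow(page, tolerance=10):
    """Rebuild text lines from word coordinates only.

    Assumption: coords begins with "left,bottom,right,top".
    Words whose bottom edge lies within `tolerance` pixels of the
    first word of the current row are treated as the same line.
    """
    words = []
    for w in page.iter("WORD"):
        left, bottom = (int(v) for v in w.get("coords").split(",")[:2])
        words.append((bottom, left, (w.text or "").strip()))
    words.sort()  # top-to-bottom, then left-to-right

    rows = []  # each entry: [reference_bottom, [(left, text), ...]]
    for bottom, left, text in words:
        if rows and abs(bottom - rows[-1][0]) <= tolerance:
            rows[-1][1].append((left, text))
        else:
            rows.append([bottom, [(left, text)]])

    return "\n".join(
        " ".join(text for _, text in sorted(row)) for _, row in rows
    )


page = ET.fromstring(SAMPLE_PAGE)
fixed_text = reflow(page)
print(fixed_text)
```

Reading the sample in document order would give "The quick / jumps over / brown fox / the dog". Note that this simple row grouping would merge the columns of a page that genuinely has several columns, so it only suits pages where the column split is itself the mistake.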
Alex brollo
wikisource-l@lists.wikimedia.org