I've just noticed
http://meta.wikimedia.org/wiki/Wikimania_2012_Wikisource_roadmap
and would like to comment on the passage
Bień, Janusz S. (2011) Efficient search in hidden text of large DjVu documents. In: Advanced Language Technologies for Digital Libraries. Lecture Notes in Computer Science (Theoretical Computer Science and General Issues) (6699). Springer, pp. 1-14. ISBN 978-3-642-23159-9
Januz also suggested this graphical editor for Djvu. He said they are working on a proof of prototype tool for a similar purpose.
The paper mentioned above is freely available as a self-archived copy: just click on the title and choose the last file, or go directly to our digital library at http://bc.klf.uw.edu.pl/177/. Some other presentations related to the tools are also available in this library; the examples and screenshots may be useful even though the text is in Polish.
More recent and more complete information is now available at the project site:
https://bitbucket.org/jsbien/ndt/wiki/wyniki
In particular, it includes the source code and links to virtual machines with the software installed.
The experimental proofing and transcribing tools are also available there. Unfortunately, the grant has come to an end, and further development can be done only by volunteers.
I would also like to draw your attention to the slides and the recording
http://bc.klf.uw.edu.pl/298/ https://sas.elluminate.com/drtbl?sid=2008350&suid=D.EACBAF1295BF570E4ADA...
which describe, in English, several programs made available by the project.
Best regards
Janusz
wikitech-l@lists.wikimedia.org