Hi, I have set up a new OCR service on tools.wmflabs.org. Through some JavaScript hosted on wikisource.org, it provides word location data for DjVu/PDF Index: pages. It can be used by adding
mw.loader.load('//wikisource.org/w/index.php?title=MediaWiki:Hocr.js&action=raw&ctype=text/javascript&dontcountme=s');
to your site-wide MediaWiki:Common.js or to your own user common.js. The script works in the Page: namespace, in edit or view mode. There is no user interface, except that double-clicking a word should highlight it on the scan. I found it very useful for encyclopedias, where it can be time-consuming to find the position of words on the image.
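If you prefer not to request the script on every page, here is a minimal sketch of a guarded load for your common.js. It assumes the Page: namespace is reported as 'Page' by wgCanonicalNamespace (the usual ProofreadPage setup, but check your wiki); the function name loadHocrInPageNamespace is only illustrative and is not part of the script itself.

```javascript
// Sketch only: load Hocr.js when the current page is in the Page: namespace.
// Assumption: the canonical namespace name is 'Page' on your wiki.
function loadHocrInPageNamespace(mwObject) {
  var scriptUrl = '//wikisource.org/w/index.php?title=MediaWiki:Hocr.js' +
    '&action=raw&ctype=text/javascript&dontcountme=s';
  if (mwObject.config.get('wgCanonicalNamespace') === 'Page') {
    mwObject.loader.load(scriptUrl);
    return true; // script was requested
  }
  return false; // other namespace, nothing loaded
}

// On a wiki, the global `mw` object is available in common.js.
if (typeof mw !== 'undefined') {
  loadHocrInPageNamespace(mw);
}
```

The guard is optional, since the script itself only acts in the Page: namespace; it just saves a request elsewhere.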
Because the OCR text and the proofread text always differ, the location of a word is often shifted by one or more words, so the location provided is only approximate.
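To illustrate the shift problem, here is a rough sketch of one way a lookup could tolerate it: start at the position where the word is expected and search outward for the nearest OCR word with the same text. This is not necessarily what Hocr.js does; the function name, the { text, bbox } word format, and the maxShift parameter are all assumptions made for the example.

```javascript
// Sketch only: find the OCR word closest to expectedIndex whose text matches.
// ocrWords is assumed to be an array of { text, bbox } objects in reading order.
function findNearestOcrWord(ocrWords, word, expectedIndex, maxShift) {
  var target = word.toLowerCase();
  for (var shift = 0; shift <= maxShift; shift++) {
    // Check the expected position first, then one step further out each round.
    var candidates = shift === 0
      ? [expectedIndex]
      : [expectedIndex - shift, expectedIndex + shift];
    for (var i = 0; i < candidates.length; i++) {
      var candidate = ocrWords[candidates[i]];
      if (candidate && candidate.text.toLowerCase() === target) {
        return { index: candidates[i], bbox: candidate.bbox };
      }
    }
  }
  return null; // no match within maxShift words; the location stays unknown
}

// Example: the proofread text has one extra word, so "fox" is expected at index 4
// but sits at index 3 in the OCR layer.
var ocrWords = [
  { text: 'The', bbox: [10, 10, 40, 25] },
  { text: 'quick', bbox: [45, 10, 95, 25] },
  { text: 'brovvn', bbox: [100, 10, 160, 25] },
  { text: 'fox', bbox: [165, 10, 195, 25] },
  { text: 'jumps', bbox: [200, 10, 250, 25] }
];
var hit = findNearestOcrWord(ocrWords, 'fox', 4, 2);
```

A word that the OCR misread (such as "brovvn" for "brown" here) still finds no match, which is one reason the location can only ever be approximate.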