2011/2/5 David Gerard dgerard@gmail.com
This is excellent!
What would it take to get this into place? What's the captcha load on WMF sites? Would e.g. the toolserver melt under the load? Perhaps on one project at a time?
Please consider that only a test run of a script, just to show that it is possible for a Python script to load the djvu text layer, select doubtful words, select the images of those words and save them into files. Now it's a matter for excellent developers: how to select djvu files, where to upload them, how to build the database of words/images, how to build a user interface to show images and to get user input... Our it.source test documents that the first step can be done. It's so rewarding to give that script the number of a djvu page, nothing more, and then to see tiff images popping out into the folder... :-)
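To give an idea of the steps, here is a minimal sketch in Python. It is not the it.source script itself, only an illustration under a few assumptions: the text layer has already been dumped with djvused (print-txt) as plain ASCII s-expressions, the "doubtful word" test is a hypothetical heuristic (non-letter characters, or a word missing from a word list), and the names parse_words, is_doubtful and crop_command are made up for this example. The crop step only builds a ddjvu command line; whether the -segment origin and scale match the text-layer coordinates should be checked against the ddjvu documentation before relying on it.

```python
import re

# One word entry of the s-expression text layer printed by djvused, e.g.
#   (word 612 2841 790 2893 "Nel")
# Assumption: plain ASCII text; escaped non-ASCII characters are not decoded here.
WORD_RE = re.compile(
    r'\(word\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"((?:[^"\\]|\\.)*)"\s*\)'
)


def parse_words(txt):
    """Return a list of (text, (xmin, ymin, xmax, ymax)) for every word entry."""
    words = []
    for m in WORD_RE.finditer(txt):
        box = tuple(int(m.group(i)) for i in range(1, 5))
        words.append((m.group(5), box))
    return words


def is_doubtful(text, known_words):
    """Hypothetical heuristic: a word is doubtful if, once surrounding
    punctuation is stripped, it contains non-letters or is not in known_words."""
    core = text.strip('.,;:!?()[]"\'')
    if not core:
        return False
    if not core.isalpha():
        return True
    return core.lower() not in known_words


def crop_command(djvu_path, page, box, out_path):
    """Build (but do not run) a ddjvu command that renders one word as TIFF.
    Assumption: text-layer coordinates can be passed to -segment unchanged."""
    xmin, ymin, xmax, ymax = box
    geometry = "%dx%d+%d+%d" % (xmax - xmin, ymax - ymin, xmin, ymin)
    return ["ddjvu", "-format=tiff", "-page=%d" % page,
            "-segment=" + geometry, djvu_path, out_path]
```

Each command returned by crop_command could then be run with subprocess, one TIFF per doubtful word, and the resulting word/image pairs stored wherever the database ends up living.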
Alex