Alex Brollo is digging into the djvu layer issue,
and we have a Dropbox folder with all the files.
If you are interested in working on that, please drop me a mail.
What we can show you right now is this:
As you can see, the text is not mapped back onto the djvu page; instead it is "stored" all together in a single region of the page (in this case, the bottom-left corner).
It is very difficult to re-map the text: for example, when we use the <ref> tag for footnotes we destroy the pattern :-(
The cool thing is that the text inside is already formatted in wikitext!
Alex assures me this is easy and just uses a few scripts from djvulibre (which is already installed on the toolserver).
The same could be done by uploading wiki-rendered HTML into the text layer.
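For anyone who wants to try, here is a minimal sketch of how one page's wikitext (or HTML) might be written into the text layer as a single zone. It is only our guess at the general shape, not Alex's actual scripts: the function names, the zone box and the temporary file are made up for illustration. Only `djvused`, its `select` and `set-txt` commands and the `-s` (save) option come from djvulibre.

```python
import os
import subprocess
import tempfile


def dsed_page_text(text, xmax, ymax):
    """Return a djvused text expression holding `text` in one page-level zone.

    The box (0, 0, xmax, ymax) is a hypothetical choice; any region would do.
    """
    escaped = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in '"\\':
            # Quotes and backslashes must be backslash-escaped inside the string.
            escaped.append("\\" + char)
        elif 32 <= byte < 127:
            escaped.append(char)
        else:
            # Everything else (newlines, non-ASCII bytes) as a 3-digit octal escape.
            escaped.append("\\%03o" % byte)
    return '(page 0 0 %d %d "%s")' % (xmax, ymax, "".join(escaped))


def store_text(djvu_path, page, text, xmax, ymax):
    """Write `text` into the text layer of one page (1-based) and save the file."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".dsed", delete=False, encoding="ascii"
    ) as handle:
        handle.write(dsed_page_text(text, xmax, ymax) + "\n")
        dsed_path = handle.name
    try:
        # Assumes the temporary path contains no spaces or quote characters.
        script = "select %d; set-txt %s" % (page, dsed_path)
        subprocess.run(["djvused", djvu_path, "-e", script, "-s"], check=True)
    finally:
        os.remove(dsed_path)
```

Something like `store_text("book.djvu", 3, wikitext, 2000, 3000)` would then replace the text layer of page 3, where the file name and page size are placeholders.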
This could be very interesting for other websites: they could just copy-and-paste the HTML file, or extract it with a simple Python script calling djvulibre routines, and then use the Commons file as a benchmark.
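The extraction side could be as small as the sketch below. It assumes the djvulibre command-line tool `djvutxt` is on the PATH and that the stored text is UTF-8; the function names and the file name in the example are ours, purely for illustration.

```python
import subprocess


def djvutxt_command(djvu_path, page):
    """Build the djvutxt command line that prints the text of one page (1-based)."""
    return ["djvutxt", "--page=%d" % page, djvu_path]


def extract_page_text(djvu_path, page):
    """Run djvutxt and return the page's text layer as a string."""
    result = subprocess.run(
        djvutxt_command(djvu_path, page), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8")
```

Calling `extract_page_text("book.djvu", 3)` would return whatever is stored in the text layer of page 3, wikitext or HTML, ready to be reused elsewhere.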
We could, maybe, give back some of our books to the Gutenberg project.
Or, maybe, give them back to GLAMs.
What do you think?
Aubrey and Alex