[Wikisource-l] Text layer Djvu (was: Biodiversity Heritage Library)

6 Aug 2012


If anyone is interested:
Alex Brollo is digging into the DjVu text layer issue,
and we have a Dropbox folder with all the files.
If you would like to work on this, please drop me a mail.
What we can show you right now is this:
https://www.dropbox.com/s/lu6re2a02xp0nyc/Dialogo%20della%20salute%20djvu%20...
As you can see, the text is not mapped back onto the DjVu page; instead it is
"stored" all together in one region of the page (in this case, the
lower-left corner).
It is very difficult to re-map the text: for example, when we use
the <ref> tag for footnotes we destroy the pattern :-(
The cool thing is that the text inside is already formatted in wikitext!
https://www.dropbox.com/s/s2c0op5e9jeu47o/Dialogo%20della%20salute%20WS%20ss...
Alex assures me this is easy and uses just a few scripts from DjVuLibre
(which is already installed on the Toolserver).
The same could be done by uploading wiki-rendered HTML into the text layer.
This could be very interesting for other websites: they could just
copy-and-paste the HTML file, or extract it with a simple Python script
calling DjVuLibre routines, and then use the Commons file as a
benchmark.
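
The extraction script mentioned above could look roughly like the sketch
below. It is only an illustration, not Alex's actual scripts: it assumes the
DjVuLibre command-line tool djvutxt is on the PATH, and the function names
and the file name in the usage note are hypothetical.

```python
import subprocess
import sys
from typing import List, Optional


def djvutxt_command(djvu_path: str, page: Optional[int] = None) -> List[str]:
    """Build the djvutxt command line that prints the hidden text layer.

    If `page` is given, only that page (1-based) is extracted.
    """
    cmd = ["djvutxt"]
    if page is not None:
        cmd.append(f"--page={page}")
    cmd.append(djvu_path)
    return cmd


def extract_text_layer(djvu_path: str, page: Optional[int] = None) -> str:
    """Run djvutxt (assumed to be installed) and return the text layer.

    Whatever is stored in the layer comes back unchanged, so wikitext or
    HTML placed there would be returned as-is.
    """
    result = subprocess.run(
        djvutxt_command(djvu_path, page),
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract_layer.py some_book.djvu [page_number]
    page_number = int(sys.argv[2]) if len(sys.argv) > 2 else None
    print(extract_text_layer(sys.argv[1], page_number))
```

Run as, for example, "python extract_layer.py some_book.djvu 5" to print the
text layer of page 5 of a local copy of the Commons file.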
We could, maybe, give some of our books back to the Gutenberg project.
Or, maybe, give them back to GLAMs.
What do you think?
Aubrey and Alex
