A great advantage of djvu files over pdf files shows up in the current
file list of any IA item. IA has removed the djvu files, but it still
builds and publishes a _djvu.xml file. Why? I presume that IA uses that
file to "map words" in its book viewer, since it has a good text structure
while being *pretty simple*. It can be translated into hOCR, and after
editing its text nodes the edited text can be loaded back into the djvu
file. it.wikisource is testing, on some texts, tricks to mass-fix the djvu
text layer (removing scannos etc.) *before* uploading the file to Commons.
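
As a rough illustration of that kind of mass-fix, here is a minimal Python sketch that edits the text nodes of a _djvu.xml file. It assumes the text layer is stored in WORD elements carrying a coords attribute; the scanno table and the sample XML are made up for the example, and words with attached punctuation are not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical scanno table: OCR misreading -> intended word.
SCANNOS = {"tbe": "the", "arid": "and"}

def fix_scannos(xml_text, scannos=SCANNOS):
    """Replace known scannos in WORD text nodes; return (new_xml, fix_count)."""
    root = ET.fromstring(xml_text)
    fixed = 0
    for word in root.iter("WORD"):
        replacement = scannos.get(word.text or "")
        if replacement is not None:
            # Only the text changes; the coords attribute is left untouched,
            # so the word keeps its position on the page image.
            word.text = replacement
            fixed += 1
    return ET.tostring(root, encoding="unicode"), fixed

# Made-up sample, reduced to one line of text.
SAMPLE = (
    '<DjVuXML><BODY><OBJECT><HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH>'
    '<LINE><WORD coords="100,200,180,160">tbe</WORD> '
    '<WORD coords="190,200,300,160">book</WORD></LINE>'
    '</PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT></OBJECT></BODY></DjVuXML>'
)

if __name__ == "__main__":
    new_xml, count = fix_scannos(SAMPLE)
    print(count, "word(s) fixed")
```

Writing the corrected XML back into the djvu file is a separate step and is not shown here.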
It's a pity IMHO that this magic book format has been disregarded. Its
structure is *open* just as the pdf structure is *closed*.
Alex
2017-01-03 0:19 GMT+01:00 Sam Wilson <sam(a)samwilson.id.au>:
I wonder if, rather than creating a new IA item, we should just link the
original IA item to the DjVu on Commons (via a review)? Or is there a
discoverability benefit to be had by having the DjVu also on IA?
On Tue, 3 Jan 2017, at 07:07 AM, Sam Wilson wrote:
Good idea. I guess it's not ideal to end up with two items, but at least
the 2nd will be updateable from our end.
It looks like we can add HTML links to IA reviews too, which is nice:
https://archive.org/details/spinoza_etica_paravia
On Mon, 2 Jan 2017, at 11:52 PM, Alex Brollo wrote:
Done :-)
Alex
2017-01-02 16:49 GMT+01:00 Alex Brollo <alex.brollo(a)gmail.com>:
Please take a look at
https://archive.org/details/spinoza_etica_paravia_djvu, this is precisely
a djvu-only item that I uploaded some days ago. I asked on the IA forum for
permission to create "djvu-only items" and I got it; this is the first item
I created. As you see, there's some "implicit convention" too: the name of
the item is the original one + a _djvu suffix (it has been derived from
https://archive.org/details/spinoza_etica_paravia), and the metadata are
the same, except for a standard warning "Derived from files into L'Etica
<https://archive.org/details/spinoza_etica_paravia>" added to the
description field.
So far I have not done the last step, i.e. adding a "backlink" from the
original item to the derived one.
internetarchive.py allows automating the whole job (downloading the
metadata of the source item, building the new item name, adding the warning
to the description field and uploading the new item).
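
The steps listed above could be sketched roughly as follows with the internetarchive Python library (its get_item and upload functions). This is only an illustration, not the script actually used: the helper names, the list of copied metadata fields and the exact wording of the warning are assumptions, and the upload needs IA credentials already configured (e.g. via "ia configure").

```python
# Assumption: only these descriptive fields are copied to the derived item;
# system fields of the source (identifier, uploader, dates...) are dropped.
COPIED_FIELDS = ("title", "creator", "date", "language", "subject",
                 "publisher", "description", "mediatype")

# Assumption: wording modelled on the warning quoted in the message.
WARNING = ('Derived from files into '
           '<a href="https://archive.org/details/{source}">{title}</a>')

def derived_identifier(source_id):
    """Item naming convention: original identifier + a _djvu suffix."""
    return source_id + "_djvu"

def derived_metadata(source_metadata, source_id):
    """Copy the descriptive metadata and prepend the standard warning."""
    metadata = {k: source_metadata[k] for k in COPIED_FIELDS
                if k in source_metadata}
    description = metadata.get("description", "")
    if isinstance(description, list):  # IA may return repeated fields as lists
        description = "\n".join(description)
    warning = WARNING.format(source=source_id,
                             title=metadata.get("title", source_id))
    metadata["description"] = (warning + "\n" + description).strip()
    return metadata

def upload_derived_item(source_id, djvu_path):
    """Download the source metadata, then upload the djvu-only item."""
    import internetarchive  # third-party: pip install internetarchive
    source = internetarchive.get_item(source_id)
    metadata = derived_metadata(source.metadata, source_id)
    return internetarchive.upload(derived_identifier(source_id),
                                  files=[djvu_path], metadata=metadata)
```

For example, upload_derived_item("spinoza_etica_paravia", "spinoza_etica_paravia.djvu") would target the item spinoza_etica_paravia_djvu; the local file name here is hypothetical.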
_______________________________________________
Wikisource-l mailing list
Wikisource-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l