Nevertheless, consider the file structure used by archive.org, which collects the page images into zip files and the text into _djvu.xml files; this is what allows its brilliant viewer to work.
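
To make the idea concrete, here is a rough Python sketch of how page images in a zip file could be paired with the text of a _djvu.xml file. It is only an illustration under my own assumptions: that the XML holds one OBJECT element per page with the words in WORD elements, and that the image names sort in page order. The function names and the list of image extensions are hypothetical, not anything archive.org publishes.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: page images carry one of these extensions.
IMAGE_EXTENSIONS = (".jp2", ".jpg", ".jpeg", ".png", ".tif", ".tiff")


def list_page_images(zip_file):
    """Return the image member names of the zip, sorted by name."""
    with zipfile.ZipFile(zip_file) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )


def page_texts(xml_text):
    """Return one string per page, joining the words found on that page.

    Assumption: each page is an OBJECT element and each word a WORD element.
    """
    root = ET.fromstring(xml_text)
    return [
        " ".join(word.text.strip() for word in page.iter("WORD") if word.text)
        for page in root.iter("OBJECT")
    ]


def pair_pages(zip_file, xml_text):
    """Pair each page image with its text, refusing mismatched page counts."""
    images = list_page_images(zip_file)
    texts = page_texts(xml_text)
    if len(images) != len(texts):
        raise ValueError(
            f"{len(images)} images but {len(texts)} text pages"
        )
    return list(zip(images, texts))
```

pair_pages accepts a path or a file-like object for the zip. The page-count check is there because a silent mismatch between images and text is exactly the kind of error that loose page files invite.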

The DjVu format really can be used as a compact images+XML container, but it seems to be an obsolete file format, as archive.org's recent discontinuation of DjVu output suggests. PDF is IMHO too complex and can't be considered an open format.

Alex Brollo

On Sat, 6 Jul 2019 at 10:51, David Starner <prosfilaes@gmail.com> wrote:

From my perspective, a DjVu or PDF file is just an archive format for
images. Any text that comes along with them is ancillary; if it's
missing, we can always generate it from OCR. I could just as well use
CBR/CBZ files, though they're not as reliable for having a sensible
format. I want to avoid, as much as possible, dealing with a bunch of
disconnected page images, because that maximizes the possibility for
human error.

--
Kie ekzistas vivo, ekzistas espero.
