Re: [Wikisource-l] The conversion from PDF to DJVU loses too much quality

27 Jan 2017

It's a pretty cool format. :)
I have the beginnings of a PHP rewrite of your Python script
running; see https://github.com/Tpt/ia-upload/pull/18 (but it's not at all
finished yet).
What is the best way to decrease the size of the JPGs before creating
the djvu? Just scale them to ~1000 px or something? How do you handle
that? (Sorry, I've read your code, but am confused...) I'm using
ImageMagick to do it, so any transformation it can do is easy to
implement.
—Sam
On Fri, 27 Jan 2017, at 03:24 PM, Alex Brollo wrote:
Yes, presently the IA jp2.zip files are the source for all derived files
and for OCR. All the derived files are homologous, i.e. the *relative*
coordinates of any element inside the images are identical, even if the
image size varies. This means that a mapping of elements (images or text)
can be carried over to any derived file.

Just an example: when a user crops an image from a djvu file with the
excellent CropTool by Danmichaelo, the coordinates of the crop could
be used to crop the high-resolution jp2 or jpg image, or to get the
coordinates of any piece of text mapped by OCR.
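
To make the cropping example concrete, here is a minimal Python sketch of carrying a crop box over from one derived image to another. It assumes only what is said above, namely that relative coordinates are identical across derivatives. The function name, the (left, top, width, height) box layout and the page sizes are hypothetical, not taken from CropTool or ia-upload.

```python
from typing import Tuple

# (left, top, width, height) in pixels; this layout is an assumption.
Box = Tuple[int, int, int, int]


def scale_crop_box(box: Box, src_size: Tuple[int, int],
                   dst_size: Tuple[int, int]) -> Box:
    """Map a crop box from one derived image to another of a different size.

    Relies on relative coordinates being the same in both images, so each
    pixel value is simply scaled by the ratio of the image dimensions.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    left, top, width, height = box
    fx = dst_w / src_w  # horizontal scale factor
    fy = dst_h / src_h  # vertical scale factor
    return (round(left * fx), round(top * fy),
            round(width * fx), round(height * fy))


# Hypothetical sizes: a 1000x1500 px djvu page and a 4000x6000 px jp2 original.
crop_on_djvu = (100, 300, 400, 250)
crop_on_jp2 = scale_crop_box(crop_on_djvu, (1000, 1500), (4000, 6000))
print(crop_on_jp2)  # (400, 1200, 1600, 1000)
```

The same scaling would apply to OCR word boxes, as long as both images really do share the same aspect ratio.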
Alex

2017-01-27 0:53 GMT+01:00 Sam Wilson sam@samwilson.id.au:
Good to know, thanks!

So, we just stick with jp2.zip.

And I love the IA magic :)

On Fri, 27 Jan 2017, at 07:40 AM, Andrea Zanni wrote:
AFAIK, IA always produces the jp2 files itself.
I suggest that GLAMs upload zipped folders of JPEGs,
so IA can do its magic and produce a book viewer and a PDF as well
as the jp2.

On Fri, Jan 27, 2017 at 12:10 AM, Sam Wilson sam@samwilson.id.au
wrote:
On Thu, 26 Jan 2017, at 06:35 PM, Andrea Zanni wrote:
The problem for me is that librarians and other people who are
genuinely interested in Wikisource and IA don't understand why they:

- upload a good scan to IA,
- see a good book on IA, via the viewer,
- get a horrible djvu on Wikisource.

This is the issue we should try to solve; otherwise we will lose a
potentially important ally, content, and a new user base.
Aubrey

Definitely!

On a related note: most (all?) IA-scanned books have e.g. *_jp2.zip
files containing all the original scan images, but is there any
standard for user-uploaded books? Like your librarians above, I
assume they're uploading individual jpg/png files? Do these get
combined into a single zip? I'm thinking that they don't, and that
ia-upload needs to provide the option of using any of the following
sources:

- .djvu
- _jp2.zip (there's also _jpg.zip and _raw_jp2.zip, but I guess we
  don't need to use them?)
- *.jpg + *.jp2 + *.png (i.e. use all images in the item, apart
  from _cover_image.jpg)
- .pdf
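
As a rough illustration of the list above, the following Python sketch groups the file names of an item by the kind of source they could provide. It is only a sketch under assumptions: the function name, the dictionary keys and the sample file names are hypothetical, and it is not how ia-upload actually works. Note that a name ending in _raw_jp2.zip also ends in _jp2.zip, so it has to be excluded explicitly.

```python
from typing import Dict, List

# Loose page images that could be used when no zip or djvu is present.
IMAGE_EXTENSIONS = (".jpg", ".jp2", ".png")


def find_sources(filenames: List[str]) -> Dict[str, List[str]]:
    """Group an item's file names by the kind of source they could provide."""
    sources: Dict[str, List[str]] = {
        "djvu": [], "jp2_zip": [], "images": [], "pdf": [],
    }
    for name in filenames:
        lower = name.lower()
        if lower.endswith(".djvu"):
            sources["djvu"].append(name)
        elif lower.endswith("_raw_jp2.zip"):
            # Skipped on purpose: it would otherwise match "_jp2.zip" below.
            continue
        elif lower.endswith("_jp2.zip"):
            sources["jp2_zip"].append(name)
        elif lower.endswith("_cover_image.jpg"):
            # The cover image is not a page scan.
            continue
        elif lower.endswith(IMAGE_EXTENSIONS):
            sources["images"].append(name)
        elif lower.endswith(".pdf"):
            sources["pdf"].append(name)
    return sources


# Hypothetical file listing for a single item.
files = ["book.djvu", "book_jp2.zip", "book_raw_jp2.zip", "book_jpg.zip",
         "page1.jpg", "book_cover_image.jpg", "book.pdf", "book_meta.xml"]
print(find_sources(files))
```

A caller could then offer whichever of the four groups is non-empty as a choice to the person doing the upload.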

Sound complete? Or are there other ways?
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l
