Simply put, from a practical point of view, my suggestion is: don't try to get
a good DjVu from the IA PDF; use the _jp2.zip images instead (after conversion
to JPG the images are very good). The result will be much better, almost as
good as the images in the IA viewer, which uses the same images.
Alex
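A minimal sketch of the workflow Alex suggests, assuming ImageMagick's `convert` is installed and that the item's _jp2.zip has already been downloaded from its details page; the file names (`bookid_jp2.zip`) and the quality setting are illustrative assumptions, not anything IA prescribes:

```python
#!/usr/bin/env python3
"""Extract an IA _jp2.zip and convert each page scan to JPG.

Assumes ImageMagick's `convert` (with its JP2/OpenJPEG delegate)
is on PATH; file names are hypothetical examples.
"""
import subprocess
import zipfile
from pathlib import Path


def jpg_name(jp2_name: str) -> str:
    """Map a page file name inside the zip to its JPG output name."""
    return Path(jp2_name).stem + ".jpg"


def convert_pages(zip_path: str, out_dir: str = "pages") -> None:
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    with zipfile.ZipFile(zip_path) as z:
        for name in z.namelist():
            if not name.lower().endswith(".jp2"):
                continue
            z.extract(name, out)
            # -quality 90 keeps detail; the JPGs can then be bundled
            # into a DjVu or uploaded as-is.
            subprocess.run(
                ["convert", str(out / name), "-quality", "90",
                 str(out / jpg_name(name))],
                check=True,
            )


if __name__ == "__main__":
    convert_pages("bookid_jp2.zip")
```

The resulting JPGs come straight from the archival JP2 masters, so they skip the segmentation and recompression applied to the derived PDF that the thread discusses.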
2016-05-13 10:06 GMT+02:00 Federico Leva (Nemo) <nemowiki(a)gmail.com>:
Alex Brollo, 13/05/2016 09:02:
I presume that this complex structure is somewhat similar to the
background/foreground segmentation in DjVu files, and that the artifacts
are similar.
Sure.
So the PDF images are not only "compressed", but deeply processed and
segmented images.
...which is what I call "compression". I still recommend trying to
increase the fixed-ppi parameter in such cases of excessive compression.
I also still need an answer to
https://it.wikisource.org/?diff=1733473
Is any of this complex IA image-processing pipeline documented
anywhere?
What do you mean? Are you asking about details of their derivation plan
for books? What we know has been summarised over time at
https://en.wikisource.org/wiki/Help:DjVu_files#The_Internet_Archive , as
always. As the help page IIRC states, the best way to understand what's
going on is to check the item history and read the derive.php log, like
https://catalogd.archive.org/log/487271468 which I linked.
The main difference compared to the past is, I think, that they're no
longer creating the LuraTech b/w PDF, probably because the "normal" PDF now
compresses well enough. They may not have realised that the single PDF
they now produce is too compressed for illustrations, and for cases where
the original JP2 is too small.
Nemo
_______________________________________________
Wikisource-l mailing list
Wikisource-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l