The problem for me is that librarians and other people who are genuinely interested in Wikisource and IA
don't understand why
* they upload a good scan to IA
* see a good book on IA, via the viewer
* get a horrible DjVu on Wikisource.

This is the issue we should try to solve, otherwise we will lose a potentially important ally, along with content and a new user base.

Aubrey

On Thu, Jan 26, 2017 at 11:26 AM, Alex Brollo <alex.brollo@gmail.com> wrote:
By now the IA PDFs are very compressed too, sometimes too much, and the result is unpredictable. The problem is that the viewer uses neither the DjVu nor the PDF, IMHO, so the quality of the PDF (and of the DjVu produced from it by pdf2djvu) doesn't reflect the quality of the viewer images at all.

The IA PDF needs a good review before it is uploaded to Commons.

There are subtle advantages to using DjVu instead of PDF, i.e. fixing errors in the source file (adding/deleting/moving pages, manipulating the text layer); DjVu is a great "wiki" format since it is open.

Alex



2017-01-25 11:35 GMT+01:00 Yann Forget <yannfo@gmail.com>:


2017-01-25 8:40 GMT+01:00 Sam Wilson <sam@samwilson.id.au>:

On Wed, 25 Jan 2017, at 03:27 PM, Andrea Zanni wrote:
On Wed, Jan 25, 2017 at 1:45 AM, Sam Wilson <sam@samwilson.id.au> wrote:

Yann, do you mean you're getting good quality DjVu generated from the PDF? Or from the original scan Jpegs?

AFAIU, Yann is using ABBYY FineReader to generate a DjVu and then uploads it directly to Commons. So outside of our ia-upload tool.
Ah, okay. So if it could be done in the tool, that'd be nicer.

Yes, it is a question of settings.
Aubrey: when you say directly use the PDF, you mean for the tool to copy that across to Commons and not create a DjVu?

Yes.
If the DjVu quality is much lower than the PDF's, there's no reason to use the DjVu over the PDF :-(
DjVu has two advantages over PDF: better compression, so smaller files for the same content, and better management of the text layer.
However, if the compression is too high, the quality is not good. It is a question of compromise between quality and size.

Yann
 
Are we saying that we *never* want to use the IA PDF? That if there's a DjVu we use it, and if there isn't we generate our own DjVu from the JP2 and djvu.xml files? Or should the tool user make this call and we give them a drop-down list of "PDF only", "Generate DjVu from PDF", and "Generate DjVu from original scans" with a note about the last of these being higher quality but slower?

I think I'm in favour of just generating a high-quality DjVu and making it simpler for the end user. But we want to be flexible too. jayantanth mentioned that he'd like to be able to just upload the PDF for example.




I can have a look at adding that feature perhaps? (Anyone else working on this?)


Please ;-)

I can try!  :-)

Aubrey
_______________________________________________
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l


