The problem for me is that librarians and other people who are genuinely interested in Wikisource and IA don't understand why they:
* upload a good scan on IA
* see a good book on IA, via the viewer
* get a horrible djvu on Wikisource.
This is the issue we should try to solve, otherwise we will lose a potentially important ally, content, and a new user base.
Aubrey
On Thu, Jan 26, 2017 at 11:26 AM, Alex Brollo alex.brollo@gmail.com wrote:
By now the IA pdfs too are very compressed, sometimes too much, and the result is unpredictable. The problem is that, IMHO, the viewer uses neither the djvu nor the pdf, so the quality of the pdf (and of the djvu that pdf2djvu produces from it) doesn't mirror the quality of the viewer images at all.
The IA pdf needs a good review before uploading it to Commons.
There are subtle advantages to using djvu instead of pdf, i.e. fixing errors in the source file (adding/deleting/moving pages, manipulating the text layer); djvu is a great "wiki" format since it is *open*.
Alex
2017-01-25 11:35 GMT+01:00 Yann Forget yannfo@gmail.com:
2017-01-25 8:40 GMT+01:00 Sam Wilson sam@samwilson.id.au:
On Wed, 25 Jan 2017, at 03:27 PM, Andrea Zanni wrote:
On Wed, Jan 25, 2017 at 1:45 AM, Sam Wilson sam@samwilson.id.au wrote:
Yann, do you mean you're getting good quality DjVu generated from the PDF? Or from the original scan JPEGs?
AFAIU, Yann is using ABBYY FineReader to generate a djvu and then uploads it directly to Commons, so outside of our ia-upload tool.
Ah, okay. So if it could be done in the tool, that'd be nicer.
Yes, it is a question of settings.
Aubrey: when you say directly use the PDF, you mean for the tool to copy that across to Commons and not create a DjVu?
Yes. If the Djvu quality is much lower than the PDF there's no reason to use the djvu over the pdf :-(
DjVu has two advantages over PDF: better compression, so smaller files for the same content, and better management of the text layer. But if the compression is too high, the quality is not good. It is a question of a compromise between quality and size.
Yann
Are we saying that we *never* want to use the IA PDF? That if there's a DjVu we use it, and if there isn't we generate our own DjVu from the JP2 and djvu.xml files? Or should the tool user make this call and we give them a drop-down list of "PDF only", "Generate DjVu from PDF", and "Generate DjVu from original scans" with a note about the last of these being higher quality but slower?
I think I'm in favour of just generating a high-quality DjVu and making it simpler for the end user. But we want to be flexible too. jayantanth mentioned in https://github.com/Tpt/ia-upload/issues/15 that he'd like to be able to just upload the PDF, for example.
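To make the options above concrete, here is a rough sketch of how the choice could be modelled. It is only an illustration in Python, not the ia-upload code: the names `UploadMode`, `REQUIRED` and `choose_mode`, the file-kind labels, and the default order (existing DjVu, then original scans, then PDF) are all assumptions taken from the wording of this thread.

```python
from enum import Enum


class UploadMode(Enum):
    # The first member is an assumption: "if there's a DjVu we use it".
    EXISTING_DJVU = "Use existing DjVu"
    # The next three labels are the drop-down entries proposed in the thread.
    PDF_ONLY = "PDF only"
    DJVU_FROM_PDF = "Generate DjVu from PDF"
    DJVU_FROM_SCANS = "Generate DjVu from original scans"


# Hypothetical labels for the kinds of files an IA item may offer.
REQUIRED = {
    UploadMode.EXISTING_DJVU: {"djvu"},
    UploadMode.PDF_ONLY: {"pdf"},
    UploadMode.DJVU_FROM_PDF: {"pdf"},
    UploadMode.DJVU_FROM_SCANS: {"jp2", "djvu.xml"},
}

# Assumed default preference when the user does not pick a mode.
DEFAULT_ORDER = [
    UploadMode.EXISTING_DJVU,
    UploadMode.DJVU_FROM_SCANS,
    UploadMode.DJVU_FROM_PDF,
]


def choose_mode(available, requested=None):
    """Return the upload mode to use for an item.

    available: iterable of file-kind labels present for the item.
    requested: an UploadMode picked by the user, or None for the default.
    """
    available = set(available)
    if requested is not None:
        missing = REQUIRED[requested] - available
        if missing:
            raise ValueError(
                f"{requested.value!r} needs missing files: {sorted(missing)}"
            )
        return requested
    for mode in DEFAULT_ORDER:
        if REQUIRED[mode] <= available:
            return mode
    raise ValueError("no usable source files for this item")
```

With this shape, "PDF only" is never chosen automatically; it is only returned when the user asks for it, which matches the idea of keeping the default simple while staying flexible.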
I can have a look at adding that feature perhaps? (Anyone else working on this?)
Please ;-)
I can try! :-)
Aubrey
_______________________________________________
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l