Oops... I presumed that the _djvu.xml file is bugged, but really I only examined the full text file (derived, I think, from the _djvu.xml file). I'll take a deeper look, examining the searchable PDF too.

Alex

2017-06-30 12:20 GMT+02:00 Alex Brollo <alex.brollo@gmail.com>:
Take a look at this case: https://archive.org/details/GiacomoRacioppiLAgiografiaDiSanLaverioDel1162Images

Here the OCR (as you can see from the _djvu.xml file) seems severely bugged, and obviously the DjVu file built by the IA Upload tool can't be better than its source.

Aubrey, please keep notifying me of any case of a faulty DjVu coming from IA or from IA files used by the IA Upload tool.

Alex

2017-06-30 10:10 GMT+02:00 Andrea Zanni <zanni.andrea84@gmail.com>:
Unfortunately, only sometimes, and apparently it's not related to the Google cover page (at least, I removed a page in one book and it doesn't have the problem, while another book is misaligned even though the cover was not removed).

Look at this:
https://it.wikisource.org/wiki/Indice:Decio_Albini_-_La_spedizione_di_Sapri,_Tip._delle_Terme_diocleziane_di_G._Balbi,_Roma_1891.djvu

On Fri, Jun 30, 2017 at 10:00 AM, Sam Wilson <sam@archives.org.au> wrote:
This is indeed a bug! I can't replicate it though. Does it happen for every book for you? Or only sometimes? Do you know what is different about the ones that fail? Is it related to removing (or not) the Google cover page?

I think I can find time this weekend to work on this.


On Fri, 30 Jun 2017, at 03:23 PM, Andrea Zanni wrote:
Hello everyone, before talking again about this let me say that I think we have a "major" bug in the IA-Upload tool:
sometimes the OCR is not aligned with the pages, meaning you have the right OCR but it's shown for the following page...
Aubrey

On Thu, May 11, 2017 at 1:30 AM, Sam Wilson <sam@samwilson.id.au> wrote:

This is very cool news. :)

One possibly not-too-onerous feature would be to permit upload of file types other than DjVu (e.g. PDF). Or there's the whole topic of creating/finding Wikidata items for the books uploaded, and updating them with the IA identifier. That'd probably require the uploading user to specify a Wikidata ID though, which is what the {{book}} template on Commons should work from anyway, in my opinion (because it can't be done via a sitelink).

I'm very happy to help with whatever I can!

—sam

On Wed, 10 May 2017, at 09:38 PM, Andrea Zanni wrote:
Dear all,
Wikimedia Italia has put 3000€ in its budget for Wikisource-related work.
When we discussed this, months ago, we thought about paying a developer for
the DjVu issue of the IA-Upload tool,
which has since been resolved by our beloved Sam Wilson.

The tool is still not perfect (I often get errors), so maybe some development is still needed, but I'd ask you (especially technically skilled people like Tpt, Sam, Philippe, etc.) if you think there is some low-hanging fruit that could be reached with that kind of budget.
Of course, we will be looking for developers, so if you want to propose yourself for something, please do! ;-)

Aubrey

_______________________________________________
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l


