[Wikisource-l] Unexpected result from a try to fix IA Upload failures

22 Dec 2017


While trying to fix some IA Upload failures, an unexpected result
emerged: an easy way to fix some common OCR errors in the djvu text
layer.
In brief, the script xml2dsed.py
https://it.wikisource.org/wiki/Progetto:Bot/Programmi_in_Python_per_i_bot/xml2dsed.py
converts IA _djvu.xml files into "dsed" (Lisp-like) code, so that the text
layer can be loaded into the djvu file in a much faster and more
controllable way using djvused.exe. While the xml tree is parsed, at WORD
level every word of the text layer is exposed to the script environment as
plain text; this offers a unique opportunity to fix many scannos without
any risk of corrupting the xml or the dsed code.
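
The word-level idea can be sketched roughly as follows. This is not the code of xml2dsed.py, only a minimal illustration under stated assumptions: the SCANNOS table, the sample xml and the function names are hypothetical, the WORD coords are assumed to be "left,bottom,right,top" measured from the top of the page, and the dsed side is assumed to want y measured from the bottom of the page with C-style escaping inside quoted strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical scanno table; a real one would be built from the book at hand.
SCANNOS = {"tbe": "the", "aud": "and"}

# Tiny hand-made sample imitating the shape of an IA _djvu.xml page (assumption).
SAMPLE_XML = """<DjVuXML><BODY><OBJECT height="1000" width="800">
<HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH><LINE>
<WORD coords="100,200,180,170">tbe</WORD>
<WORD coords="190,200,300,170">"donna"</WORD>
</LINE></PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT>
</OBJECT></BODY></DjVuXML>"""


def fix_word(text):
    """Fix one word as plain text; markup and dsed syntax are never touched."""
    return SCANNOS.get(text, text)


def dsed_quote(text):
    """Quote a string for dsed, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def words_to_dsed(xml_text):
    """Return one '(word xmin ymin xmax ymax "text")' line per WORD element."""
    root = ET.fromstring(xml_text)
    lines = []
    for page in root.iter("OBJECT"):
        height = int(page.get("height"))
        for word in page.iter("WORD"):
            # Only the first four values are used; extra ones are ignored.
            left, bottom, right, top = [
                int(v) for v in word.get("coords").split(",")[:4]
            ]
            text = fix_word(word.text or "")
            # Flip the y axis: xml counts from the top, dsed from the bottom
            # (assumption stated above).
            lines.append("(word %d %d %d %d %s)" % (
                left, height - bottom, right, height - top, dsed_quote(text)))
    return lines


if __name__ == "__main__":
    for line in words_to_dsed(SAMPLE_XML):
        print(line)
```

Because the fix is applied to the text of each word before it is quoted, a replacement can never break the surrounding structure. The lines produced here would still have to be nested into page and line expressions before djvused could use them.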
Here is the first djvu file
https://commons.wikimedia.org/wiki/File:Trattati_del_Cinquecento_sulla_donna,_1913_%E2%80%93_BEIC_1949816.djvu
on which this has been successfully tested.
Alex Brollo
