I opted for this approach, which is format-independent:
Add wikisourcetext.py: load text in Page ns
https://gerrit.wikimedia.org/r/#/c/283935/

On Sat, Apr 16, 2016 at 12:15 PM, Mpaa <mpaa.wiki@gmail.com> wrote:

As a matter of fact, PDF is already accepted by ProofreadPage.
A script in pywikibot will not change much with respect to IA's choices.

On Fri, Apr 15, 2016 at 9:41 AM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Andrea Zanni, 15/04/2016 09:03:

I remember Alex Brollo was working with the djvu_xml layer.

The XML output from ABBYY is still being published, AFAIK.

Nemo

_______________________________________________
pywikibot mailing list
pywikibot@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikibot