@Alex since IA no longer uses djvu, there is demand on en.wikisource for a script similar to djvutxt.py for pdf (or a single script could handle both formats ...)
On Thu, Apr 14, 2016 at 7:35 PM, John Mark Vandenberg jayvdb@gmail.com wrote:
On 14 Apr 2016 02:18, "Mpaa" mpaa.wiki@gmail.com wrote:
Hi.
Is there any preference for a python pdf library, in case one would like
to add pdf file processing to pywikibot?
I have no preference, or experience. PyPDF2 seems to be the most commonly used, though it has a few worrying Python 3 encoding bugs.
Or is it good enough, if possible, to rely on pdfinfo (which I guess is
Linux-only)?
For a script, using pdfinfo is definitely good enough IMO. reflinks uses pdfinfo.
There are precompiled binaries available for Windows at http://www.foolabs.com/xpdf/download.html
-- John
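For illustration, here is a rough sketch of what relying on pdfinfo from a script could look like. It assumes the pdfinfo executable is on PATH and that it prints "Key: value" lines, including a "Pages" line. The function names parse_pdfinfo and pdf_page_count are made up for this example and are not part of pywikibot.

```python
import subprocess


def parse_pdfinfo(output):
    """Turn the 'Key:   value' lines printed by pdfinfo into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as dates keep theirs.
        key, sep, value = line.partition(':')
        if sep:
            info[key.strip()] = value.strip()
    return info


def pdf_page_count(path):
    """Return the page count of a PDF by running the external pdfinfo tool."""
    # Assumption: pdfinfo is installed and reachable via PATH.
    result = subprocess.run(['pdfinfo', path],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            check=True)
    info = parse_pdfinfo(result.stdout.decode('utf-8', errors='replace'))
    return int(info['Pages'])
```

Keeping the parsing separate from the subprocess call means the parsing can be tested on machines where pdfinfo is not installed. If the binary is missing, subprocess.run raises FileNotFoundError, which a script would probably want to catch and report.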
pywikibot mailing list pywikibot@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikibot