Sorry for the off-topic question:
Kishan, is there any straightforward way to find out how many pages a
DjVu/PDF contains through an API call?
By straightforward I mean not asking for new pages, till you start getting
On Sat, May 17, 2014 at 1:51 AM, Gaurav Vaidya <gaurav(a)ggvaidya.com> wrote:
On 16 May, 2014, at 11:27 am, Kishan Thobhani <thobhanikishan(a)gmail.com>
I was redirected here by Sumana Harihareswara
with a proposed task of
documenting the API for the ProofreadPage extension (
) and later helping
to improve it.
At this point ProofreadPage (prp) API adds 2 hooks over API under
1.) Properties - prop=proofread ( This is to get Proofreading level of
2.) Meta - meta=proofreadinfo ( Local
Configuration Information )
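
As a rough illustration of the two modules listed above, here is a minimal Python sketch that only builds the query strings. It assumes the standard MediaWiki action=query conventions (format=json, titles joined with "|"); the function names are made up for this example, and the page titles are placeholders. The result would be appended to the wiki's api.php URL.

```python
from typing import Dict, List
from urllib.parse import urlencode


def proofread_level_params(page_titles: List[str]) -> Dict[str, str]:
    """Parameters for prop=proofread on the given Page: titles."""
    return {
        "action": "query",
        "format": "json",
        "prop": "proofread",
        # MediaWiki accepts several titles separated by "|".
        "titles": "|".join(page_titles),
    }


def proofread_info_params() -> Dict[str, str]:
    """Parameters for meta=proofreadinfo (local configuration information)."""
    return {
        "action": "query",
        "format": "json",
        "meta": "proofreadinfo",
    }


def to_query_string(params: Dict[str, str]) -> str:
    """Encode parameters as a URL query string for api.php."""
    return urlencode(params)


if __name__ == "__main__":
    # Placeholder titles, following the "Page:<file>/<n>" pattern.
    titles = ["Page:Entire book.djvu/1", "Page:Entire book.djvu/2"]
    print(to_query_string(proofread_level_params(titles)))
    print(to_query_string(proofread_info_params()))
```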
In this context, I would really appreciate it if someone could share their
thoughts and help
me compile notes to proceed further.
Points could include:
1.) Use cases of the API.
2.) Existing components/projects/bots already using the proofread API.
3.) Anything else.
You can do a lot with ProofreadPage without any new APIs. For example, I
wrote a Perl module to download an entire book from the English Wikisource
as WikiText two years ago. At that time, I implemented it for a
hypothetical “Index:Entire book.pdf” by:
1. Using prop=imageinfo to get the number of pages for “File:Entire
2. Using prop=revisions to download the Wikitext for each individual page
from “Page:Entire book.djvu/1” to “Page:Entire book.djvu/9999” (if the
image had 9,999 pages).
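
The two steps above can be sketched in Python (not the original Perl) roughly as follows. This is only an outline under some assumptions: that prop=imageinfo with iiprop=size reports a "pagecount" field for multi-page files, and that the JSON response uses the usual query/pages layout. The function names and the sample response are invented for illustration, and the HTTP request itself is left out.

```python
from typing import Any, Dict, List


def imageinfo_params(file_title: str) -> Dict[str, str]:
    """Step 1: ask for the file's size info (assumed to include pagecount)."""
    return {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "size",
        "titles": file_title,
    }


def extract_page_count(response: Dict[str, Any]) -> int:
    """Pull the page count out of a decoded imageinfo response."""
    for page in response["query"]["pages"].values():
        for info in page.get("imageinfo", []):
            if "pagecount" in info:
                return int(info["pagecount"])
    raise ValueError("response contains no pagecount")


def page_titles(file_name: str, page_count: int) -> List[str]:
    """Titles 'Page:<file>/1' .. 'Page:<file>/<page_count>'."""
    return [f"Page:{file_name}/{n}" for n in range(1, page_count + 1)]


def revisions_params(titles: List[str]) -> Dict[str, str]:
    """Step 2: ask for the current wikitext of the given pages."""
    return {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "rvprop": "content",
        "titles": "|".join(titles),
    }


if __name__ == "__main__":
    # Hand-written sample response, not real API output.
    sample = {
        "query": {
            "pages": {
                "-1": {
                    "title": "File:Entire book.djvu",
                    "imageinfo": [{"width": 1000, "height": 1500, "pagecount": 3}],
                }
            }
        }
    }
    count = extract_page_count(sample)
    print(revisions_params(page_titles("Entire book.djvu", count)))
```

This also avoids requesting pages one by one until an error comes back, since the page count is known before the first Page: request.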
This will work for Wikisources that redirect “File:”, “Index:” and “Page:”
into their local namespaces. I ignored the proofread status entirely, since
all the pages I needed to download had already been transcribed, but I
guess it might be helpful to have an API query that could return the
proofread status for every page in an Index page. That’s the only idea I
have for now!
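
Until such a query exists, something close to it might be pieced together on the client side from the existing prop=proofread. The sketch below is a guess at how that could look: it assumes a limit of 50 titles per request (the usual limit for non-bot accounts) and assumes each page in the response carries a "proofread" object with a "quality" field. Both should be checked against the live API, and the helper names are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional


def chunked(titles: List[str], size: int = 50) -> Iterator[List[str]]:
    """Split titles into batches small enough for one request (assumed limit)."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def collect_quality(response: Dict[str, Any]) -> Dict[str, Optional[int]]:
    """Map each page title to its proofread quality level, or None if absent."""
    result: Dict[str, Optional[int]] = {}
    for page in response.get("query", {}).get("pages", {}).values():
        proofread = page.get("proofread")  # assumed response field
        result[page["title"]] = proofread["quality"] if proofread else None
    return result


if __name__ == "__main__":
    titles = [f"Page:Entire book.djvu/{n}" for n in range(1, 121)]
    print([len(batch) for batch in chunked(titles)])

    # Hand-written sample response, not real API output.
    sample = {
        "query": {
            "pages": {
                "10": {"title": "Page:Entire book.djvu/1",
                       "proofread": {"quality": 3}},
                "-1": {"title": "Page:Entire book.djvu/2", "missing": ""},
            }
        }
    }
    print(collect_quality(sample))
```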
If you’re curious, the Perl code I wrote is available at
Wikisource-l mailing list