Note that the long-standing "checker" tool at toollabs already generates the transclusion listing per work:
https://tools.wmflabs.org/checker
eg. https://tools.wmflabs.org/checker/?db=enwikisource_p&title=Index:Dream_d...
enWS has been using it on its Index: namespace pages for ages.
Regards, Billinghurst
On Fri, Apr 29, 2016 at 12:47 AM, Philippe Elie phil.el@free.fr wrote:
On Thu, 28 Apr 2016 at 15:55 +0200, Alex Brollo wrote:
Very interesting.
Do you have any suggestions for finding the list of pages that are not transcluded? I can imagine using a bot to fetch the HTML of the ns0 main page and all its subpages related to an Index page, then parsing it to get the list of existing page links; is there a simpler strategy?
Alex
If you have access to the database, the simplest way is to reuse the code of this tool: https://github.com/phil-el/phetools/blob/master/statistics/not_transcluded.p... The function not_transcluded() is nearly what you need. I'll probably show the list of pages not transcluded in a future version, but this tool gets that list for every Index: on a wiki and the query takes a few minutes, so it's not handy for checking the transclusion status of a single index.
To get that list for only one index, it'll be easier to use the API: 1) get all links on the Index: page, filtered to the Page: namespace; 2) use the embeddedin API to get all transclusions from ns:0. The result of 1) minus the result of 2) is what you are looking for. You can do 1) in one request, and you can probably get the proofread status in the same request, as you are probably only interested in yellow or green pages that are not transcluded. 2) is perhaps possible in only one request; I don't remember. A tool like this to complement mine could be very useful. It's possible I'll provide a simpler API on toollabs to do that.
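The two API steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of any existing tool: the endpoint (English Wikisource), the Page: namespace id 104 (it differs between wikis), and the helper names api_get, index_page_links, is_transcluded and not_transcluded are all chosen here for the example. It uses prop=links for step 1 and list=embeddedin for step 2, with one embeddedin request per page, so it does not answer whether step 2 can be done in a single request.

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikisource.org/w/api.php"  # assumption: English Wikisource
PAGE_NS = 104  # assumption: id of the Page: namespace on enWS; differs per wiki


def api_get(params):
    """Run one API request and return the decoded JSON (hypothetical helper)."""
    query = urllib.parse.urlencode(dict(params, format="json"))
    req = urllib.request.Request(
        API + "?" + query, headers={"User-Agent": "not-transcluded-sketch/0.1"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def index_page_links(index_title, get=api_get):
    """Step 1: titles linked from the Index: page, limited to the Page: namespace."""
    params = {"action": "query", "prop": "links", "titles": index_title,
              "plnamespace": PAGE_NS, "pllimit": "max"}
    titles = set()
    while True:
        data = get(params)
        for page in data.get("query", {}).get("pages", {}).values():
            titles.update(link["title"] for link in page.get("links", []))
        if "continue" not in data:
            return titles
        # Feed the continuation values back in to fetch the next batch.
        params = dict(params, **data["continue"])


def is_transcluded(page_title, get=api_get):
    """Step 2: True if at least one ns:0 page transcludes page_title."""
    data = get({"action": "query", "list": "embeddedin", "eititle": page_title,
                "einamespace": 0, "eilimit": 1})
    return bool(data.get("query", {}).get("embeddedin"))


def not_transcluded(index_title, get=api_get):
    """Result of step 1 minus the pages found transcluded in step 2."""
    links = index_page_links(index_title, get)
    return sorted(t for t in links if not is_transcluded(t, get))
```

Calling not_transcluded("Index:Some_work.djvu") (a made-up title) would return the linked Page: titles that no main-namespace page transcludes. The links on an Index: page may also include pages that do not exist yet, so filtering by existence or by proofread status would probably still be needed on top of this.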
-- phe
Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l