https://doc.wikimedia.org/Parsoid/master/#!/guide/jsapi also gives a nice interface for walking a document structure, including recursing into template arguments and so on. It could be made much faster by fetching content from RESTBase.
Note that links generated by templates are a special case. Do you want only the links that appear in the *arguments* to the template? Or do you also want the links that are contained in the template itself? These cases are slightly different. --scott
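To make the template case concrete, here is a minimal sketch (not the jsapi interface linked above) that splits the wiki links in Parsoid HTML into those inside template output and those outside it. It assumes the markup described in the MediaWiki DOM spec: wiki links carry rel="mw:WikiLink", the first node of template output carries typeof="mw:Transclusion", and every node of that output shares the same about attribute. The names LinkCollector and split_links are hypothetical. It also assumes well-formed, properly nested input. It does not separate links written in the template's arguments from links produced by the template body; as far as I know that would mean inspecting the argument wikitext stored in the data-mw attribute.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LinkCollector(HTMLParser):
    """Collect wiki links, split by whether they sit inside template output."""

    def __init__(self):
        super().__init__()
        self._stack = []        # one bool per open element: inside template output?
        self._tpl_ids = set()   # about ids seen on typeof="mw:Transclusion" nodes
        self.template_links = []
        self.plain_links = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        about = a.get("about")
        typeof = (a.get("typeof") or "").split()
        # Assumption: the first node of a transclusion declares the about id.
        if about and "mw:Transclusion" in typeof:
            self._tpl_ids.add(about)
        parent_in_tpl = self._stack[-1] if self._stack else False
        in_tpl = parent_in_tpl or about in self._tpl_ids
        if tag == "a" and "mw:WikiLink" in (a.get("rel") or "").split():
            target = self.template_links if in_tpl else self.plain_links
            target.append(a.get("href"))
        if tag not in VOID:
            self._stack.append(in_tpl)

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()


def split_links(html):
    """Return (links outside templates, links inside template output)."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.plain_links, collector.template_links
```

Calling split_links() on the body of a page fetched from the REST API should give two lists of href values, which can then be filtered depending on which of the two cases you care about.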
On Wed, Sep 30, 2015 at 9:44 AM, Eric Evans eevans@wikimedia.org wrote:
On Wed, Sep 30, 2015 at 3:35 AM, Dimitrov, Dimitar < Dimitar.Dimitrov@gesis.org> wrote:
- What is the fastest way to get the HTML of an article for a specific
revision, or what is the best tool to set up a local copy of Wikipedia (currently I am experimenting with Xowa and Wikitaxi)?
You can use the REST API to fetch article HTML by revision (see: https://en.wikipedia.org/api/rest_v1/?doc).
For example: https://en.wikipedia.org/api/rest_v1/page/html/Main%20Page/664887982
The output this produces is generated by Parsoid (see: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec).
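As a rough illustration of the request above, the following sketch builds the same kind of URL and fetches the HTML using only the Python standard library. The function names html_url and fetch_html and the User-Agent string are placeholders, not part of any Wikimedia client library. The URL layout is taken from the example link above.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

API_ROOT = "https://en.wikipedia.org/api/rest_v1"


def html_url(title, revision=None):
    """Build the page/html URL for a title and, optionally, a revision id."""
    # Percent-encode the whole title, including any "/" it may contain.
    url = f"{API_ROOT}/page/html/{quote(title, safe='')}"
    if revision is not None:
        url += f"/{revision}"
    return url


def fetch_html(title, revision=None,
               user_agent="example-script/0.1 (placeholder contact)"):
    """Fetch the Parsoid HTML of a page; raises urllib.error.HTTPError on failure."""
    # The User-Agent value is a placeholder; replace it with real contact details.
    request = Request(html_url(title, revision),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

For instance, fetch_html("Main Page", 664887982) would request the URL shown in the example above, and omitting the revision requests the page without a revision suffix.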
-- Eric Evans eevans@wikimedia.org