On Mon, Oct 2, 2017 at 6:30 PM, Roy Smith roy@panix.com wrote:
I’m not seeing how to access the wikitext for a specific revision via the API.
Something like this?
curl 'https://en.wikipedia.org/w/api.php?action=query&format=json&prop=rev...'
(In any case, you should probably use a higher-level API client from your favourite language's libraries.)
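For illustration, here is a minimal Python sketch of the same kind of request using only the standard library. The parameter set is my assumption of a reasonable query (prop=revisions with revids, rvprop=content, rvslots=main, formatversion=2), not necessarily the exact URL above, and the function names and User-Agent string are made up for the example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_revision_query(revid):
    """Return query parameters asking for the wikitext of one revision."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # assumption: the newer, list-based JSON layout
        "prop": "revisions",
        "revids": str(revid),
        "rvprop": "ids|timestamp|content",
        "rvslots": "main",
    }


def extract_wikitext(response):
    """Pull the main-slot wikitext out of a formatversion=2 response."""
    page = response["query"]["pages"][0]
    revision = page["revisions"][0]
    return revision["slots"]["main"]["content"]


def fetch_wikitext(revid):
    """Perform the HTTP request; needs network access."""
    url = API_URL + "?" + urllib.parse.urlencode(build_revision_query(revid))
    # Hypothetical User-Agent; replace it with one that identifies your tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "revision-example/0.1 (hypothetical)"}
    )
    with urllib.request.urlopen(request) as reply:
        return extract_wikitext(json.load(reply))
```

Calling fetch_wikitext() with a revision id of your choice should return that revision's wikitext as a string, assuming the revision exists and is publicly visible.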
It is not possible to share those through the database protocol. Not only is wikitext not stored in a human-readable format, but metadata and content are stored separately, and the only feasible way to share it while maintaining user privacy and access control is to put an application in between (MediaWiki itself :-P) or to export it (dumps).
Hope that is helpful.
What I want to do is get the wikitext for every revision of a page.
If it is for a single page, you can pass multiple revids in one request. But if you plan to do that on a massive scale, extracting the dumps will be both faster for you and easier on the servers. There is probably close to 100 TB of plain-text wiki content across all projects.
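As a rough sketch of the multiple-revids approach: the API accepts several revision ids joined with "|" in the revids parameter. The batch size of 50 below is my assumption about the per-request limit for ordinary (non-bot) clients, so check the API help before relying on it. The function names are invented for the example.

```python
def batch_revids(revids, batch_size=50):
    """Yield 'id1|id2|...' strings holding at most batch_size revids each."""
    ids = [str(r) for r in revids]
    for start in range(0, len(ids), batch_size):
        yield "|".join(ids[start:start + batch_size])


def build_batch_queries(revids, batch_size=50):
    """Return one parameter dict per batch, ready to be URL-encoded."""
    queries = []
    for joined in batch_revids(revids, batch_size):
        queries.append({
            "action": "query",
            "format": "json",
            "formatversion": "2",
            "prop": "revisions",
            "revids": joined,
            "rvprop": "ids|timestamp|content",
            "rvslots": "main",
        })
    return queries
```

Each dict can be URL-encoded and sent as a separate request. For a page with thousands of revisions this adds up quickly, which is why the dumps are the better choice for anything large.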