What is the best way to get the current text of an article from the Toolserver these days - the API, WikiProxy, or something else? I need to examine thousands of pages on a daily basis and want to streamline the process as much as possible.
Thanks,
Jason
Ja Ga wrote:
What is the best way to get the current text of an article from the Toolserver these days - the API, WikiProxy, or something else? I need to examine thousands of pages on a daily basis and want to streamline the process as much as possible.
Use Special:Export to dump the pages (100 or so at a time).
-- daniel
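A minimal sketch of the batching suggested above, assuming the standard Special:Export form fields (`pages`, `curonly`, `action=submit`) and an English Wikipedia endpoint. The URL, the function name `export_batches`, and the batch size of 100 are illustrative assumptions, not something given in the thread. It only builds the POST bodies; nothing is sent.

```python
from urllib.parse import urlencode

# Assumed endpoint; substitute the wiki you are actually reading from.
EXPORT_URL = "https://en.wikipedia.org/w/index.php?title=Special:Export"


def export_batches(titles, batch_size=100):
    """Yield (url, form_body) pairs, one per batch of page titles."""
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        form = {
            "pages": "\n".join(batch),  # one title per line
            "curonly": "1",             # current revision only, no history
            "action": "submit",
        }
        # Encoded bytes, ready to be used as a POST body.
        yield EXPORT_URL, urlencode(form).encode("utf-8")
```

Each pair could then be passed to `urllib.request.urlopen(url, data=body)`, which issues a POST and returns the XML dump for that batch.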
On Sun, May 2, 2010 at 9:06 PM, Ja Ga jaga_x_1@yahoo.com wrote:
What is the best way to get the current text of an article from the Toolserver these days - the API, WikiProxy, or something else? I need to examine thousands of pages on a daily basis and want to streamline the process as much as possible.
Thanks,
Jason
From a Toolserver account? Just query the database. If you are not working from the Toolserver, probably just use the API on the WMF servers, but if you cause too much load you will get blocked.
But why do you have to query several thousand pages on a daily basis?
-peachey
From a Toolserver account? Just query the database. If you are not working from the Toolserver, probably just use the API on the WMF servers, but if you cause too much load you will get blocked.
The article text cannot be fetched from the database on the Toolserver.
Using the API would work just as well as Special:Export, as long as each request fetches as many pages as possible. One request per page would suck.
-- daniel
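To illustrate the point about fetching many pages per request, here is a rough sketch that groups titles into batched MediaWiki API queries (`action=query`, `prop=revisions`, `rvprop=content`). The endpoint, the function name `api_query_urls`, and the default batch size of 50 are assumptions; the real per-request title limit depends on the account's rights. It only builds the URLs and performs no requests.

```python
from urllib.parse import urlencode

# Assumed endpoint; substitute the wiki you are actually reading from.
API_URL = "https://en.wikipedia.org/w/api.php"


def api_query_urls(titles, batch_size=50):
    """Return one query URL per batch of titles instead of one per page."""
    urls = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",         # include the revision text
            "titles": "|".join(batch),   # the API separates titles with "|"
            "format": "json",
        }
        urls.append(API_URL + "?" + urlencode(params))
    return urls
```

With these defaults, 5,000 titles come to 100 requests rather than 5,000, which is the difference the reply is pointing at.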
toolserver-l@lists.wikimedia.org