Give me a few minutes and I can get you a database dump of what you need.
On Sat, May 6, 2017 at 5:25 PM, Abdulfattah Safa fattah.safa@gmail.com wrote:
- I'm using max as the limit parameter.
- I'm not sure if the dumps have the data I need. I need the titles of all articles (namespace = 0), excluding redirects, and the titles of all categories (namespace = 14), also excluding redirects.
On Sat, May 6, 2017 at 11:39 PM Eran Rosenthal eranroz89@gmail.com wrote:
- You can use the limit parameter to get more titles in each request.
- For getting many entries, it is recommended to extract them from the dumps or from the database using Quarry (see the sketch below).
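For the database route, the whole job is one short query against the page table (page_namespace, page_title, page_is_redirect). Below is a minimal sketch in Python: the SQL is what you would paste into the Quarry web UI, and the pymysql connection settings follow the usual Toolforge/Wiki-Replicas conventions (enwiki_p, replica.my.cnf), so treat the host and paths as assumptions to verify, not something confirmed in this thread.

import pymysql
import pymysql.cursors

# Titles of all articles (ns 0) and categories (ns 14), skipping redirects.
QUERY = """
SELECT page_namespace, page_title
FROM page
WHERE page_namespace IN (0, 14)
  AND page_is_redirect = 0
"""

conn = pymysql.connect(
    host="enwiki.labsdb",                  # replica host (assumption; check current docs)
    db="enwiki_p",                         # public replica of enwiki
    read_default_file="~/replica.my.cnf",  # Toolforge credentials file
    charset="utf8mb4",
)
try:
    # A server-side cursor streams the ~13M rows instead of buffering them all.
    with conn.cursor(pymysql.cursors.SSCursor) as cur:
        cur.execute(QUERY)
        for ns, title in cur:
            # page_title is stored as binary, so decode before printing.
            print(ns, title.decode("utf-8") if isinstance(title, bytes) else title)
finally:
    conn.close()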
On May 6, 2017 22:36, "Abdulfattah Safa" fattah.safa@gmail.com wrote:
For the & in $continue=-||, it's a typo; it doesn't exist in the code.
On Sat, May 6, 2017 at 10:12 PM Abdulfattah Safa fattah.safa@gmail.com wrote:
I'm trying to get all the page titles in Wikipedia in namespace 0 using the API, as follows:

xml&list=allpages&apnamespace=0&apfilterredir=nonredirects&aplimit=max&$continue=-||$apcontinue=BASE_PAGE_TITLE
I keep requesting this URL and checking whether the response contains a continue tag. If it does, I send the same request again but replace *BASE_PAGE_TITLE* with the value of the apcontinue attribute from the response. My application has been running for 3 days now and the number of titles retrieved exceeds 30M, whereas it is about 13M in the dumps. Any idea?
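For reference, the standard continuation loop for list=allpages looks like the following minimal sketch (Python with requests; json output swapped in for the xml used above, and the function name is illustrative). The key detail is echoing the response's continue block back into the next request verbatim, rather than rebuilding the URL by hand, which is the usual cause of re-fetched or double-counted batches.

import requests

API = "https://en.wikipedia.org/w/api.php"

def all_titles(namespace=0):
    """Yield every non-redirect title in the given namespace."""
    params = {
        "action": "query",
        "format": "json",          # json instead of the xml used above
        "list": "allpages",
        "apnamespace": namespace,
        "apfilterredir": "nonredirects",
        "aplimit": "max",
        "continue": "",            # opt in to the new continuation style
    }
    session = requests.Session()
    while True:
        data = session.get(API, params=params).json()
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break                        # no more batches
        params.update(data["continue"])  # echo continue + apcontinue back

print(sum(1 for _ in all_titles(0)))     # should land near the dump's count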
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l