1. I'm using max as the limit parameter.
2. I'm not sure whether the dumps have the data I need. I need the titles
of all articles (namespace = 0), excluding redirects, and also the
titles of all categories (namespace = 14), excluding redirects.
On Sat, May 6, 2017 at 11:39 PM Eran Rosenthal <eranroz89(a)gmail.com> wrote:
1. You can use the limit parameter to get more titles in each request.
2. For getting many entries, it is recommended to extract them from the
dumps or from the database using Quarry.
On May 6, 2017 22:36, "Abdulfattah Safa" <fattah.safa(a)gmail.com> wrote:
As for the & in $Continue=-||, it's a typo.
It doesn't exist in the code.
On Sat, May 6, 2017 at 10:12 PM Abdulfattah Safa <fattah.safa(a)gmail.com>
wrote:
> I'm trying to get all the page titles in Wikipedia in namespace using the
> xml&list=allpages&apnamespace=0&apfilterredir=nonredirects&aplimit=max&$continue=-||$apcontinue=BASE_PAGE_TITLE
>
> I keep requesting this URL and checking whether the response contains a
> continue tag. If yes, then I use the same request but change the
> *BASE_PAGE_TITLE* to the value in the apcontinue attribute in the
> response.
> My application has been running for 3 days and the number of retrieved
> titles exceeds 30M, whereas it is about 13M in the dumps.
> Any idea?
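
The continuation loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the application in question: it assumes the English Wikipedia endpoint, asks for JSON rather than the XML used in the thread, and the helper names `fetch_json` and `iter_titles` are hypothetical. Instead of substituting only the apcontinue value by hand, it copies every key of the response's `continue` object into the next request, and it lets `urlencode` escape the title, which may help rule out a loop that keeps re-reading the same range.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; other wikis use a different host.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # A descriptive User-Agent is expected by Wikimedia sites (value is a placeholder).
    req = urllib.request.Request(
        url, headers={"User-Agent": "allpages-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_titles(namespace, fetch=fetch_json):
    """Yield titles of non-redirect pages in one namespace.

    `fetch` is injectable so the loop can be exercised without network access.
    """
    base = {
        "action": "query",
        "format": "json",
        "list": "allpages",
        "apnamespace": namespace,
        "apfilterredir": "nonredirects",
        "aplimit": "max",
    }
    cont = {}  # no continuation parameters on the first request
    while True:
        data = fetch({**base, **cont})
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break  # the API signals the end by omitting "continue"
        # Pass back every continuation key (e.g. "apcontinue" and "continue").
        cont = data["continue"]
```

With this sketch, `iter_titles(0)` would walk the articles and `iter_titles(14)` the categories. Collecting the titles into a set and comparing its size with the number of titles yielded is one way to check whether the same pages are being returned more than once.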
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l