jenkins-bot merged this change.
[FIX] site.preloadpages: split pagelist in at most max_ids elements
In site.preloadpages(), max_ids, i.e. the API query limit, is
computed only after pagelist has been split into chunks of groupsize
elements, so a chunk can hold more pages than the API accepts.
Pagelist shall be split into chunks of min(groupsize, max_ids)
elements.
Bug: T209111
Change-Id: I7ae76e26d2500cc8abea679b5e80465fb07690c9
---
M pywikibot/site.py
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/pywikibot/site.py b/pywikibot/site.py
index 8033eeb..58fb89f 100644
--- a/pywikibot/site.py
+++ b/pywikibot/site.py
@@ -3370,7 +3370,13 @@
         rvprop = ['ids', 'flags', 'timestamp', 'user', 'comment', 'content']
-        for sublist in itergroup(pagelist, groupsize):
+        parameter = self._paraminfo.parameter('query+info', 'prop')
+        if self.logged_in() and self.has_right('apihighlimits'):
+            max_ids = int(parameter['highlimit'])
+        else:
+            max_ids = int(parameter['limit'])  # T78333, T161783
+
+        for sublist in itergroup(pagelist, min(groupsize, max_ids)):
             # Do not use p.pageid property as it will force page loading.
             pageids = [str(p._pageid) for p in sublist
                        if hasattr(p, '_pageid') and p._pageid > 0]
@@ -3388,12 +3394,6 @@
             rvgen = api.PropertyGenerator(props, site=self)
             rvgen.set_maximum_items(-1)  # suppress use of "rvlimit" parameter
-            parameter = self._paraminfo.parameter('query+info', 'prop')
-            if self.logged_in() and self.has_right('apihighlimits'):
-                max_ids = int(parameter['highlimit'])
-            else:
-                max_ids = int(parameter['limit'])  # T78333, T161783
-
             if len(pageids) == len(sublist) and len(set(pageids)) <= max_ids:
                 # only use pageids if all pages have them
                 rvgen.request['pageids'] = set(pageids)
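
The effect of the change can be illustrated outside pywikibot with a minimal sketch. The itergroup() below is a local stand-in written for this example (it is assumed to behave like the helper used in site.py, yielding lists of at most `size` items), and chunk_pages(), the page list, and the limit values 60 and 50 are hypothetical, chosen only to show a groupsize larger than the API limit.

```python
from itertools import islice


def itergroup(iterable, size):
    """Yield successive lists of at most ``size`` items.

    Local stand-in for the helper used in site.py (an assumption of this
    sketch, not the library code itself).
    """
    iterator = iter(iterable)
    while True:
        group = list(islice(iterator, size))
        if not group:
            return
        yield group


def chunk_pages(pagelist, groupsize, max_ids):
    """Hypothetical helper: split pagelist so no chunk exceeds max_ids."""
    return list(itergroup(pagelist, min(groupsize, max_ids)))


pages = list(range(120))          # placeholder "pages"
old_chunks = list(itergroup(pages, 60))                     # before the fix
new_chunks = chunk_pages(pages, groupsize=60, max_ids=50)   # after the fix

print([len(c) for c in old_chunks])
print([len(c) for c in new_chunks])
```

With these assumed values, the old split produces chunks of 60 pages, which would exceed a limit of 50, while the new split caps every chunk at the limit.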
To view, visit change 473940.