Bugs item #3590676, was opened at 2012-11-28 05:00
Message generated for change (Comment added) made by dixond
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3590676...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: DixonD (dixond)
Assigned to: xqt (xqt)
Summary: Page._getVersionHistory returns only a part of a history
Initial Comment:
There is a bug in Page._getVersionHistory. It doesn't load the whole history if it is large. The problem is here (wikipedia.py):

    if len(result['query']['pages'].values()[0]['revisions']) < revCount:
        thisHistoryDone = True

I believe it should be as follows:

    if not getAll and len(result['query']['pages'].values()[0]['revisions']) >= revCount:
        thisHistoryDone = True
Version.py:
Pywikipedia trunk/pywikipedia/ (r10745, 2012/11/20, 13:03:05)
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
----------------------------------------------------------------------
Comment By: DixonD (dixond) Date: 2012-12-01 13:34
Message: Changing the condition still returns 4250 entries for me (did you miss the "not getAll and " part in my code?)
But if I use fullVersionHistory instead of getVersionHistory, it returns only 192 entries for me. For example, try the following code:

    import wikipedia as pywikibot
    p = pywikibot.Page('de', 'user talk:xqt')
    h = p.fullVersionHistory(getAll=True)
    print len(h)
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2012-12-01 13:22
Message: First of all, _getVersionHistory() is an internal method and you shouldn't use it directly. Use getVersionHistory() instead. The condition is quite right. Try the following statements:

    import pywikibot as pwb
    p = pwb.Page('de', 'user talk:xqt')
    h = p.getVersionHistory(getAll=True)
    len(h)

which gives 4250 entries (so far).
Changing the condition will return 500 entries only.
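
The two stop conditions can be compared on a toy model of batch-wise loading. This is only a sketch under stated assumptions, not the code in wikipedia.py: fetch_batch, load_history and stop_on_full_batch are hypothetical names, the "API" is simulated with a range of integers, and the real method may have other exit paths that are not modelled here.

```python
def fetch_batch(total, start, limit):
    """Hypothetical stand-in for one API request.

    Returns up to `limit` revision ids, beginning at offset `start`,
    from a simulated history that holds `total` revisions.
    """
    return list(range(start, min(start + limit, total)))


def load_history(total, rev_count, stop_on_full_batch=False):
    """Collect revisions batch by batch until the stop condition fires."""
    revisions = []
    done = False
    while not done:
        batch = fetch_batch(total, len(revisions), rev_count)
        revisions.extend(batch)
        if stop_on_full_batch:
            # Inverted test: a full batch ends the loop at once.
            # The empty-batch check only keeps this toy loop finite.
            done = len(batch) >= rev_count or not batch
        else:
            # Test as quoted in the report: a short batch means
            # nothing is left to fetch.
            done = len(batch) < rev_count
    return revisions


if __name__ == "__main__":
    print(len(load_history(4250, 500)))
    print(len(load_history(4250, 500, stop_on_full_batch=True)))
```

In this model, a history of 4250 revisions fetched 500 at a time is read completely with the "< revCount" test, while the inverted ">= revCount" test stops after the first full batch of 500.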
----------------------------------------------------------------------
Comment By: DixonD (dixond) Date: 2012-12-01 07:25
Message: Yes, of course. It is quite obvious that the following code prevents the rest of the revisions from being loaded by setting thisHistoryDone to True:

    if len(result['query']['pages'].values()[0]['revisions']) < revCount:
        thisHistoryDone = True

Am I missing anything?
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2012-12-01 00:22
Message: Are you sure that you have set getAll=True while invoking that method?
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org