https://bugzilla.wikimedia.org/show_bug.cgi?id=59678
Web browser: ---
Bug ID: 59678
Summary: Implement badtoken detection and recovery
Product: Pywikibot
Version: core (2.0)
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: General
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: maarten@mdammers.nl
Classification: Unclassified
Mobile Platform: ---
Every once in a while I get a badtoken exception. This is probably because I have multiple bots running on the same site at the same time (race condition).

* Bot A requests token -> 123
* Bot B requests token -> 123
* Bot A edits with token 123 -> ok
* Bot B edits with token 123 -> poof
We could of course implement very difficult synchronization, but it doesn't happen very often, so it's probably better to handle it like a collision in Ethernet.

* Detect the badtoken
* Back off for a random number of seconds
* Get a new token
* Do the edit

Max tries should be respected so the bot can't get into an infinite retry loop.
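The back-off steps above could be sketched roughly as follows. This is only an illustration, not Pywikibot code: `BadTokenError`, `edit_with_retry`, `get_token` and `do_edit` are hypothetical names, and the default retry limit and back-off ceiling are arbitrary assumptions.

```python
import random
import time


class BadTokenError(Exception):
    """Hypothetical stand-in for an API error whose code is 'badtoken'."""


def edit_with_retry(get_token, do_edit, max_tries=5, max_backoff=10.0,
                    sleep=time.sleep):
    """Call do_edit(token), fetching a new token after each badtoken failure.

    get_token and do_edit are caller-supplied callables (assumed here);
    the last BadTokenError is re-raised once max_tries is exhausted.
    """
    for attempt in range(1, max_tries + 1):
        token = get_token()  # fetch the token right before using it
        try:
            return do_edit(token)
        except BadTokenError:
            if attempt == max_tries:
                raise  # respect max tries: no infinite retry loop
            # back off for a random interval, like after an Ethernet collision
            sleep(random.uniform(0, max_backoff))
```

Fetching the token at the top of every attempt also keeps the window between getting a token and using it as short as possible.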
--- Comment #1 from Maarten Dammers maarten@mdammers.nl --- Example:
badtoken
* '''Sorry! We could not process your edit due to a loss of session data.''' Please try again. If it still does not work, try [[Special:UserLogout|logging out]] and logging back in.
* There seems to be a problem with your login session; this action has been canceled as a precaution against session hijacking. Go back to the previous page, reload that page and then try again.
{u'messages': {u'1': {u'type': u'error', u'name': u'sessionfailure'}, u'0': {u'type': u'error', u'name': u'session_fail_preview'}, u'html': {u'*': u'<ul>\n<li> <b>Sorry! We could not process your edit due to a loss of session data.</b>\n</li>\n</ul>\n<p>Please try again.\nIf it still does not work, try <a href="/wiki/Special:UserLogout" title="Special:UserLogout">logging out</a> and logging back in.\n</p>\n<ul>\n<li> There seems to be a problem with your login session;\n</li>\n</ul>\n<p>this action has been canceled as a precaution against session hijacking.\nGo back to the previous page, reload that page and then try again.\n</p>'}}}
--- Comment #2 from Merlijn van Deen valhallasw@arctus.nl --- OK, so this is slightly more complicated than it seems.
There are two obvious methods:
- handle the BadToken error in data/api.py. We can just self.sleep() and then get a new edit token
- handle the BadToken error in data/page.py, in editpage()
Both options have their problems.
data/api.py:
  good: we can also handle other types of token problems
  bad: edit tokens also serve to detect edit conflicts, and we cannot handle those at the data/api.py level...

data/page.py:
  good: the logic for getting tokens & handling edit conflicts is already here!
  bad: the retry logic is in the data/api.py layer, and it doesn't cover other token issues
--- Comment #3 from Maarten Dammers maarten@mdammers.nl --- Wikidata is very unstable today so I keep running into:
  File "C:\pywikibot\coredev\pywikibot\data\api.py", line 458, in submit
    raise APIError(code, info, **result["error"])
pywikibot.data.api.APIError: badtoken: <strong>Sorry! We could not process your edit due to a loss of session data.</strong> Please try again. If it still does not work, try [[Special:UserLogout|logging out]] and logging back in.
<class 'pywikibot.data.api.APIError'>
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort
Marking this as a bug. The bot shouldn't crash on this.
Maarten Dammers maarten@mdammers.nl changed:
What    |Removed       |Added
----------------------------------------------------------------------------
Priority|Unprioritized |High
Severity|normal        |major
John Mark Vandenberg jayvdb@gmail.com changed:
What    |Removed       |Added
----------------------------------------------------------------------------
CC      |              |jayvdb@gmail.com,
        |              |ricordisamoa@openmailbox.org
--- Comment #4 from John Mark Vandenberg jayvdb@gmail.com --- We have a changeset pending to overhaul token management in site.py https://gerrit.wikimedia.org/r/#/c/139372/
It adds caching of tokens so, with badtoken now appearing more regularly, the cache needs better management of how long these tokens are useful for.
Amir Ladsgroup ladsgroup@gmail.com changed:
What    |Removed       |Added
----------------------------------------------------------------------------
CC      |              |ladsgroup@gmail.com
--- Comment #5 from Amir Ladsgroup ladsgroup@gmail.com ---
> data/api.py:
>   good: we can also handle other types of token problems
>   bad: edit tokens also serve to detect edit conflicts, and we cannot handle those at the data/api.py level...
Maybe I'm wrong, but edit conflicts are not detected by tokens; MediaWiki detects them via basetimestamp [1], and when an edit conflict happens it raises an editconflict error, not a badtoken error. See the error table.
[1]: https://www.mediawiki.org/wiki/API:Edit

If we want to avoid undetected edit conflicts, the only thing we need to do is add basetimestamp to action=edit API calls.
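As a rough illustration of that suggestion, the request parameters might be assembled like this. `build_edit_params` is a hypothetical helper, not an existing Pywikibot function; the parameter names (`action`, `title`, `text`, `token`, `basetimestamp`, `starttimestamp`) are the ones documented on the API:Edit page.

```python
def build_edit_params(title, text, token, base_timestamp, start_timestamp=None):
    """Return parameters for an action=edit request that can detect conflicts.

    base_timestamp is the timestamp of the revision the new text was based on.
    This helper is only an assumed sketch of how a caller might pass it along.
    """
    params = {
        "action": "edit",
        "title": title,
        "text": text,
        "basetimestamp": base_timestamp,  # lets MediaWiki report editconflict
        "token": token,
    }
    if start_timestamp is not None:
        # optional: when the editing process began
        params["starttimestamp"] = start_timestamp
    return params
```

For example, `build_edit_params("Sandbox", "new text", token, "2014-01-06T12:00:00Z")` would carry the base revision timestamp along with the edit, so a conflicting edit should surface as editconflict rather than going unnoticed.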
Merlijn van Deen valhallasw@arctus.nl changed:
What    |Removed       |Added
----------------------------------------------------------------------------
CC      |              |valhallasw@arctus.nl
--- Comment #6 from Merlijn van Deen valhallasw@arctus.nl --- (In reply to John Mark Vandenberg from comment #4)
> We have a changeset pending to overhaul token management in site.py https://gerrit.wikimedia.org/r/#/c/139372/
>
> It adds caching of tokens so, with badtoken now appearing more regularly, the cache needs better management of how long these tokens are useful for.
The entire problem is /caching/ the tokens. They are not valid for a fixed time, they are valid for /one edit/. Basically it's a race condition, so there are two options:

1) the 'nice' way: implement locking. Requires some sort of interprocess communication.

2) the 'hacky' way: reduce the prevalence of the condition (by reducing the time between getting a token and using it), and retrying -- effectively using the remote MW instance as lock.
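Option 1 could, for instance, be approximated with a lock file, assuming all bots run on the same machine and share a filesystem. The sketch below is only one possible form of interprocess locking; `site_lock` and its parameters are hypothetical and not part of Pywikibot.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def site_lock(path, timeout=30.0, poll=0.1):
    """Hold an exclusive lock file at `path` while the with-block runs."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("could not acquire %s" % path)
            time.sleep(poll)  # another process holds the lock; wait and retry
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)  # release the lock for the next process
```

A bot would wrap "get token, then edit" in `with site_lock(...)`. Note that a crashed bot leaves a stale lock file behind, which is part of why this route is the more involved one.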
(In reply to Amir Ladsgroup from comment #5)
> Maybe I'm wrong, but edit conflicts are not detected by tokens; MediaWiki detects them via basetimestamp [1], and when an edit conflict happens it raises an editconflict error, not a badtoken error. See the error table.
> [1]: https://www.mediawiki.org/wiki/API:Edit
Yes, you are right. So we can just implement this at the data/api.py level.
--- Comment #7 from Ricordisamoa ricordisamoa@openmailbox.org --- (In reply to Merlijn van Deen from comment #6)
> The entire problem is /caching/ the tokens. They are not valid for a fixed time, they are valid for /one edit/.
https://lists.wikimedia.org/pipermail/mediawiki-api-announce/2014-August/000...
«All tokens may be cached as long as the session is valid; none are dependent on factors such as the page being edited or the user being targeted.»
And some of them are always the same (e.g. editToken & protectToken). They will be merged with the change announced above.
However, since we want to be able to work with multiple accounts on the same wiki, we need better caching.
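One way to picture such caching is a cache keyed by site, account and token type, which can be flushed for a single account when its session turns out to be invalid. This is a minimal sketch under that assumption; `TokenCache` and the `fetch` callable are hypothetical and do not describe the pending changesets.

```python
class TokenCache:
    """Hypothetical per-account token cache; not Pywikibot's actual class."""

    def __init__(self, fetch):
        # fetch is an assumed callable(site, user, token_type) -> token string
        self._fetch = fetch
        self._tokens = {}

    def get(self, site, user, token_type="edit"):
        """Return the cached token, fetching it only on the first request."""
        key = (site, user, token_type)
        if key not in self._tokens:
            self._tokens[key] = self._fetch(site, user, token_type)
        return self._tokens[key]

    def invalidate(self, site, user):
        """Drop every cached token for one account, e.g. after a badtoken."""
        for key in [k for k in self._tokens if k[:2] == (site, user)]:
            del self._tokens[key]
```

Because the user name is part of the key, two accounts on the same wiki never share a token, and a badtoken for one account does not discard the other account's cache.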
--- Comment #8 from John Mark Vandenberg jayvdb@gmail.com --- There is another patch going through review, which will help organise the framework for this.
https://gerrit.wikimedia.org/r/#/c/159394/
John Mark Vandenberg jayvdb@gmail.com changed:
What    |Removed       |Added
----------------------------------------------------------------------------
Blocks  |              |70936
John Mark Vandenberg jayvdb@gmail.com changed:
What    |Removed       |Added
----------------------------------------------------------------------------
See Also|              |https://bugzilla.wikimedia.org/show_bug.cgi?id=54311
--- Comment #9 from John Mark Vandenberg jayvdb@gmail.com --- Here is a related MW changeset to time limit the tokens https://gerrit.wikimedia.org/r/#/c/156336/
555 lugusto@gmail.com changed:
What    |Removed       |Added
----------------------------------------------------------------------------
CC      |              |lugusto@gmail.com
Blocks  |              |35925
--- Comment #10 from 555 lugusto@gmail.com --- Adding bug 35925 to properly track this issue (it is causing problems in a Wikisource-specific gadget). More on the issue: https://fr.wikisource.org/w/index.php?oldid=4780982#Match_.26_Split. Further info related to the gadget: https://en.wikisource.org/wiki/Help:Match_and_split
--- Comment #11 from John Mark Vandenberg jayvdb@gmail.com --- Today is the first time I remember it appearing in Travis builds: https://travis-ci.org/wikimedia/pywikibot-core/jobs/40487338
pywikipedia-bugs@lists.wikimedia.org