jenkins-bot has submitted this change and it was merged.
Change subject: [bugfix] Workaround UnicodeDecodeError on api error
[bugfix] Workaround UnicodeDecodeError on api error
When an API error occurs, the request tries to log its parameters, but this fails because they are stored in a dict, and the str and repr of a dict use the repr of each key and value. If a value is a Page instance, its repr in Python 2 returns bytes encoded with the console encoding. When the str or repr of the dict is then inserted into a unicode string, the bytes must be decoded. That decoding uses ASCII by default and fails if the Page's title contains non-ASCII characters.
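The failure described above can be imitated on any Python version. The following is only a sketch: `FakePage` is a hypothetical stand-in for the real Page class, and `CONSOLE_ENCODING` is an assumed value standing in for `config.console_encoding`. The Python 2 byte string is simulated by encoding the text explicitly.

```python
class FakePage(object):
    """Hypothetical stand-in for pywikibot's Page (not the real class)."""

    def __init__(self, title):
        self.title = title

    def __repr__(self):
        return 'FakePage(%s)' % self.title


# Assumption: stands in for config.console_encoding.
CONSOLE_ENCODING = 'utf-8'

params = {'titles': FakePage(u'Ümlä üt')}

# str() of a dict is built from the repr() of each key and value.
text = str(params)
contains_repr = repr(params['titles']) in text

# Simulate what Python 2 would hold: bytes in the console encoding.
raw = u"{'titles': FakePage(Ümlä üt)}".encode(CONSOLE_ENCODING)

# Mixing such bytes into a unicode string implies an ASCII decode,
# which fails as soon as a non-ASCII byte is present.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```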
This patch works around that by manually decoding the bytes in Python 2 using the console encoding whenever an API error happens. It does not actually fix Page's repr method, so the error might still occur in other places. The workaround assumes that the rest of the string is encoded either with the console encoding or with ASCII, and that the console encoding is a superset of ASCII.
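A minimal sketch of the workaround idea, under stated assumptions: `param_repr_as_text` is a hypothetical helper (the patch inlines the logic and checks a `PY2` flag instead of using `isinstance`), and `CONSOLE_ENCODING` is an assumed value standing in for `config.console_encoding`.

```python
# Assumption: stands in for config.console_encoding.
CONSOLE_ENCODING = 'utf-8'


def param_repr_as_text(params, encoding=CONSOLE_ENCODING):
    """Return str(params) as text (hypothetical helper, not in the patch)."""
    param_repr = str(params)
    # In Python 2, str() returns bytes, so decode them explicitly.
    # In Python 3, str() already returns text and this branch is skipped.
    if isinstance(param_repr, bytes):
        param_repr = param_repr.decode(encoding)
    return param_repr


# Simulated Python 2 result: the dict's str() as console-encoded bytes.
simulated_py2_repr = u"{'titles': Page(Ümlä üt)}".encode(CONSOLE_ENCODING)
decoded = simulated_py2_repr.decode(CONSOLE_ENCODING)

# Once decoded, formatting into a unicode string no longer needs ASCII.
message = u"API Error: query=\n%s" % decoded

# The superset assumption: pure-ASCII bytes decode to the same text
# with either codec, so the rest of the string is unaffected.
ascii_only = b"{'action': 'query'}"
same_either_way = (ascii_only.decode('ascii')
                   == ascii_only.decode(CONSOLE_ENCODING))
```

If the console encoding were not a superset of ASCII, the ASCII-only parts of the string could decode to different characters, which is why the commit message states that assumption explicitly.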
Bug: T66958
Change-Id: I298b7594599dd189211a8c268c7e094d042f40e6
---
M pywikibot/data/api.py
M tests/api_tests.py
2 files changed, 15 insertions(+), 4 deletions(-)
Approvals:
  Xqt: Looks good to me, approved
  jenkins-bot: Verified
diff --git a/pywikibot/data/api.py b/pywikibot/data/api.py
index 539eed3..3e5d960 100644
--- a/pywikibot/data/api.py
+++ b/pywikibot/data/api.py
@@ -1809,11 +1809,17 @@
                     pywikibot.error("Detected MediaWiki API exception %s%s"
                                     % (class_name,
                                        "; retrying" if retry else "; raising"))
+                    # Due to bug T66958, Page's repr may return non ASCII bytes
+                    # Get as bytes in PY2 and decode with the console encoding as
+                    # the rest should be ASCII anyway.
+                    param_repr = str(self._params)
+                    if PY2:
+                        param_repr = param_repr.decode(config.console_encoding)
                     pywikibot.log(u"MediaWiki exception %s details:\n"
                                   u" query=\n%s\n"
                                   u" response=\n%s"
                                   % (class_name,
-                                     pprint.pformat(self._params),
+                                     pprint.pformat(param_repr),
                                      result))
 
                     if retry:
@@ -1869,8 +1875,14 @@
                                         for e in user_tokens.items())))
                 # raise error
                 try:
+                    # Due to bug T66958, Page's repr may return non ASCII bytes
+                    # Get as bytes in PY2 and decode with the console encoding as
+                    # the rest should be ASCII anyway.
+                    param_repr = str(self._params)
+                    if PY2:
+                        param_repr = param_repr.decode(config.console_encoding)
                     pywikibot.log(u"API Error: query=\n%s"
-                                  % pprint.pformat(self._params))
+                                  % pprint.pformat(param_repr))
                     pywikibot.log(u" response=\n%s"
                                   % result)
 
diff --git a/tests/api_tests.py b/tests/api_tests.py
index 4331e35..c3aa17f 100644
--- a/tests/api_tests.py
+++ b/tests/api_tests.py
@@ -29,7 +29,7 @@
     DefaultSiteTestCase,
     DefaultDrySiteTestCase,
 )
-from tests.utils import allowed_failure, expected_failure_if, FakeLoginManager
+from tests.utils import allowed_failure, FakeLoginManager
 
 if not PY2:
     from urllib.parse import unquote_to_bytes
@@ -151,7 +151,6 @@
         with PatchedRequest(self._dummy_request):
             self.assertRaises(api.APIMWException, req.submit)
 
-    @expected_failure_if(PY2)
     def test_API_error_encoding_Unicode(self):
         """Test a Page instance as parameter using non-ASCII chars."""
         page = pywikibot.page.Page(self.site, 'Ümlä üt')
pywikibot-commits@lists.wikimedia.org