jenkins-bot submitted this change.

Approvals:
  Matěj Suchánek: Looks good to me, but someone else must approve
  Xqt: Looks good to me, approved
  jenkins-bot: Verified

[IMPR] Show a more informative warning if content.decode() fails

Show a more informative warning if content.decode() fails with
UnicodeDecodeError. An "Unknown or invalid encoding" message is
not very helpful when the real cause is the content itself,
e.g. when it contains b'\xe4\xf6\xfc'.

Change-Id: I9d534e002bec33865873b736720723f93a8e01de
---
M pywikibot/comms/http.py
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/pywikibot/comms/http.py b/pywikibot/comms/http.py
index b31f531..5807013 100644
--- a/pywikibot/comms/http.py
+++ b/pywikibot/comms/http.py
@@ -449,14 +449,18 @@
         """Helper function to try decoding."""
         if encoding is None:
             return None
+
         try:
             content.decode(encoding)
-        except (LookupError, UnicodeDecodeError):
+        except LookupError:
             pywikibot.warning('Unknown or invalid encoding {!r}'
                               .format(encoding))
-            # let chardet do the job
-            return None
-        return encoding
+        except UnicodeDecodeError as e:
+            pywikibot.warning('{} found in {}'.format(e, content))
+        else:
+            return encoding
+
+        return None  # let chardet do the job

     header_encoding = _get_encoding_from_response_headers(response)
     if header_encoding is None:
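
For illustration, a minimal standalone sketch of the control flow after this change. It is not the pywikibot code: it assumes the standard `logging` module in place of `pywikibot.warning`, and `try_decode` is a hypothetical top-level stand-in for the nested helper in `http.py`. It shows that an unknown codec name and undecodable content now take separate `except` branches, while both still fall through to `return None`.

```python
import logging

logger = logging.getLogger(__name__)


def try_decode(content, encoding):
    """Return encoding if content decodes with it, otherwise None.

    Hypothetical stand-in for the helper touched by the diff above.
    """
    if encoding is None:
        return None

    try:
        content.decode(encoding)
    except LookupError:
        # The codec name itself is not known to Python.
        logger.warning('Unknown or invalid encoding %r', encoding)
    except UnicodeDecodeError as e:
        # The codec exists, but the bytes are not valid in it.
        logger.warning('%s found in %r', e, content)
    else:
        return encoding

    return None  # the caller falls back to another detection method


# 'äöü' encoded as Latin-1; this byte sequence is not valid UTF-8.
sample = b'\xe4\xf6\xfc'

print(try_decode(sample, 'utf-8'))          # UnicodeDecodeError branch
print(try_decode(sample, 'latin-1'))        # decodes cleanly
print(try_decode(sample, 'no-such-codec'))  # LookupError branch
```

With the bytes from the commit message, the first call logs the decode error together with the offending content instead of blaming the encoding name.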

Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I9d534e002bec33865873b736720723f93a8e01de
Gerrit-Change-Number: 676281
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info@gno.de>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97@gmail.com>
Gerrit-Reviewer: Xqt <info@gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-CC: Mpaa <mpaa.wiki@gmail.com>
Gerrit-MessageType: merged