jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/407590 )
Change subject: diff_checker.py: Decode tokenizer strings using 'utf-8' encoding on Python 2
......................................................................
diff_checker.py: Decode tokenizer strings using 'utf-8' encoding on Python 2
The tokenizer on Python 3 detects the source encoding itself and returns
unicode strings.[1] The tokenizer on Python 2 returns byte strings, which
must be decoded explicitly; otherwise the default encoding (sometimes
'ascii') is used, which raises UnicodeDecodeError for non-ASCII input.

[1] See: https://docs.python.org/3/library/tokenize.html#tokenize.detect_encoding
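The Python 3 behaviour described above can be checked with a small sketch.
It is illustrative only and is not part of the change; the sample source
bytes and the variable names are assumptions made up for this example.

```python
import io
import tokenize

# Hypothetical source bytes containing a non-ASCII string literal.
source = "# -*- coding: utf-8 -*-\ns = 'caf\u00e9'\n".encode('utf-8')

# detect_encoding() reads at most two lines and returns the encoding name
# together with the raw lines it consumed.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# On Python 3, tokenize.tokenize() takes a bytes readline and yields
# tokens whose .string attribute is already decoded text (str).
string_tokens = [
    tok.string
    for tok in tokenize.tokenize(io.BytesIO(source).readline)
    if tok.type == tokenize.STRING
]
```

Here every entry of string_tokens is a str, so no explicit decode step is
needed on Python 3.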
Bug: T186301
Change-Id: I029ae20145bb634c72e2f7f24b8c749d5885fb25
---
M scripts/maintenance/diff_checker.py
1 file changed, 4 insertions(+), 0 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/scripts/maintenance/diff_checker.py b/scripts/maintenance/diff_checker.py
index 734befd..a055143 100644
--- a/scripts/maintenance/diff_checker.py
+++ b/scripts/maintenance/diff_checker.py
@@ -30,8 +30,10 @@
 from subprocess import check_output
 from sys import version_info
 if version_info.major == 3:
+    PY2 = False
     from tokenize import tokenize, STRING
 else:
+    PY2 = True
     from tokenize import generate_tokens as tokenize, STRING
 from unidiff import PatchSet
@@ -72,6 +74,8 @@
         break
     if start[0] not in line_nos or type_ != STRING:
         continue
+    if PY2:
+        string = string.decode('utf-8')
     match = STRING_MATCH(string)
     if match.group('unicode_literal'):
         error = True
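For comparison, the decode step can also be written without a version
flag. This is only a sketch of an alternative, not what the change does:
to_text is a hypothetical helper, and the sample byte string stands in for
a token as a Python 2 tokenizer might return it. The explicit 'ascii'
decode imitates the failing default-encoding decode mentioned in the
commit message.

```python
def to_text(token_string, encoding='utf-8'):
    """Return token_string as text, decoding it only if it is bytes."""
    # On Python 2, bytes is an alias of str, so byte-string tokens are
    # decoded there as well; text (unicode) tokens pass through unchanged.
    if isinstance(token_string, bytes):
        return token_string.decode(encoding)
    return token_string


# Hypothetical raw token: a UTF-8 encoded string literal as bytes.
raw = b"u'caf\xc3\xa9'"

# Decoding with 'ascii' fails on the non-ASCII bytes.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with 'utf-8', as the change does on Python 2, succeeds.
decoded = to_text(raw)
```

The merged change keeps the explicit PY2 flag instead, which makes the
Python 2 special case easy to find and remove later.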
--
To view, visit
https://gerrit.wikimedia.org/r/407590
Gerrit-MessageType: merged
Gerrit-Change-Id: I029ae20145bb634c72e2f7f24b8c749d5885fb25
Gerrit-PatchSet: 4
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>