jenkins-bot has submitted this change and it was merged.
Change subject: reflinks.py - UnicodeDecodeError
......................................................................
reflinks.py - UnicodeDecodeError
Decode the linked page text with the first candidate encoding (adding
utf-8 as a candidate if it is missing) to avoid a UnicodeDecodeError.
There may be a better way to solve this issue.
Bug: 67410
Change-Id: Ia2051a2a80851b15b1a04a135763291bd633d4e3
---
M scripts/reflinks.py
1 file changed, 3 insertions(+), 1 deletion(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/reflinks.py b/scripts/reflinks.py
index 6a84381..232507b 100644
--- a/scripts/reflinks.py
+++ b/scripts/reflinks.py
@@ -681,7 +681,9 @@
elif u'.zh' in ref.link:
enc.append("gbk")
- u = linkedpagetext
+ if not 'utf-8' in enc:
+ enc.append('utf-8')
+ u = linkedpagetext.decode(enc[0]) # Bug 67410
# Retrieves the first non empty string inside <title> tags
for m in self.TITLE.finditer(u):
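
The hunk above decodes with enc[0] only, so an incorrect first guess can still raise. As a hedged illustration of one possible alternative (not what was merged), the sketch below tries each candidate encoding in order and falls back to a lossy UTF-8 decode. The helper name decode_with_fallback and its parameters are hypothetical and do not exist in reflinks.py; the sketch assumes Python 3 bytes input.

```python
def decode_with_fallback(raw, encodings):
    """Decode raw bytes with the first candidate encoding that works.

    Hypothetical helper, not part of reflinks.py. `encodings` plays the
    role of the `enc` list in the hunk above.
    """
    candidates = list(encodings)
    # Mirror the patch: make sure utf-8 is among the candidates.
    if 'utf-8' not in candidates:
        candidates.append('utf-8')
    for name in candidates:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    # Last resort: never raise, substitute U+FFFD for undecodable bytes.
    return raw.decode('utf-8', errors='replace')


if __name__ == '__main__':
    page = '<title>中文</title>'.encode('gbk')
    print(decode_with_fallback(page, ['gbk']))
```

Trying every candidate costs a few extra decode attempts on failure, but keeps the title search running on pages whose declared or guessed encoding is wrong.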
--
To view, visit
https://gerrit.wikimedia.org/r/144969
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia2051a2a80851b15b1a04a135763291bd633d4e3
Gerrit-PatchSet: 4
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Beta16 <l.rabinelli(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: jenkins-bot <>