jenkins-bot has submitted this change and it was merged.
Change subject: (bug 66345) solve unicode error in html2unicode
......................................................................
(bug 66345) solve unicode error in html2unicode
On narrow Python 2 builds, unichr() isn't defined for values >= 0x10000 and raises a ValueError whose message includes the remark "(narrow Python build)". (I guess it is defined in py3, but there are remaining problems on Windows for py < 3.4.)
The unicode character is now created from a '\Uxxxxxxxx' literal instead.
Change-Id: I7a38aae7a19079e7cdf93d6cc6922b2214558973
---
M pywikibot/page.py
1 file changed, 7 insertions(+), 1 deletion(-)
Approvals:
  Merlijn van Deen: Looks good to me, approved
  jenkins-bot: Verified
diff --git a/pywikibot/page.py b/pywikibot/page.py
index fb55dfb..1d4c643 100644
--- a/pywikibot/page.py
+++ b/pywikibot/page.py
@@ -29,6 +29,7 @@
 import logging
 import re
+import sys
 import unicodedata
 import collections
@@ -3623,7 +3624,12 @@
             except KeyError:
                 pass
             if unicodeCodepoint and unicodeCodepoint not in ignore:
-                result += unichr(unicodeCodepoint)
+                # solve narrow Python build exception (UTF-16)
+                if unicodeCodepoint > sys.maxunicode:
+                    unicode_literal = lambda n: eval("u'\U%08x'" % n)
+                    result += unicode_literal(unicodeCodepoint)
+                else:
+                    result += unichr(unicodeCodepoint)
             else:
                 # Leave the entity unchanged
                 result += text[match.start():match.end()]
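For comparison only, and not part of this change: on a narrow build a code point above sys.maxunicode can also be produced without eval() by computing its UTF-16 surrogate pair directly. The following is a minimal sketch under that assumption; the names surrogate_pair and codepoint_to_text are hypothetical and do not exist in pywikibot.

```python
import sys

try:
    unichr  # Python 2
except NameError:
    unichr = chr  # Python 3: chr() already covers the full Unicode range


def surrogate_pair(codepoint):
    """Return the (high, low) UTF-16 surrogates for a non-BMP code point."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("not a supplementary code point: %#x" % codepoint)
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # upper 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits of the offset
    return high, low


def codepoint_to_text(codepoint):
    """Hypothetical helper: build the character for codepoint without eval()."""
    if codepoint > sys.maxunicode:
        # Narrow build: unichr() would raise ValueError here, so emit the
        # two surrogate code units that together encode the character.
        high, low = surrogate_pair(codepoint)
        return unichr(high) + unichr(low)
    return unichr(codepoint)
```

On a wide build (or Python 3.3 and later) the first branch is never taken and the helper reduces to a plain unichr()/chr() call. On a narrow build the result is a two-unit string, which is how such builds represent non-BMP characters internally.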
pywikibot-commits@lists.wikimedia.org