Decoding strings issue in PWB - pywikibot

23 Jun 2017

Hi.
Does PWB have issues with decoding URL strings?

Try this script:
from __future__ import absolute_import, unicode_literals

import re, urllib
import pywikibot

mylist = [
    u"Åge Hovengen",
    u"Åge Konradsen",
    u"Åge Ramberg",
]

for a in mylist:
    ssite = pywikibot.getSite("en")
    spage = pywikibot.Page(ssite, a)
    text = spage.get()
    m0 = re.search(
        ur"\{\{\s*Stortingetbio\s*\|\s*(?:id=)?\s*([^\s}\|]+)\s*[\|\}]",
        text, flags=re.IGNORECASE)
    if m0:
        m = m0.group(1)
        test1 = urllib.unquote(m)
        test2 = urllib.unquote_plus(m)
        test3 = m.decode('utf8')
        test4 = m.encode('utf8')
        pywikibot.output(test1)
        pywikibot.output(test2)
        pywikibot.output(test3)
        pywikibot.output(test4)

For me it doesn't decode %c3%85 to Å, while at http://repl.it/Izdw/2 you can see that
pure Python can decode that sequence with urllib.unquote and urllib.unquote_plus. Is
this a PWB bug, or something else?
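
For comparison, here is a minimal sketch of the result I expect, with no pywikibot involved. It assumes the captured id is a unicode string holding UTF-8 percent-escapes such as %c3%85, and that the difference comes from unquoting a unicode string rather than a byte string (I have not confirmed that this is the cause). unquote_utf8 is a hypothetical helper name of my own, not part of pywikibot or urllib.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

try:
    # Python 3: returns the raw bytes behind the percent-escapes
    from urllib.parse import unquote_to_bytes
except ImportError:
    # Python 2: urllib.unquote on a byte string also returns a byte string
    from urllib import unquote as unquote_to_bytes


def unquote_utf8(value):
    """Percent-decode `value` and interpret the resulting bytes as UTF-8.

    Hypothetical helper for illustration only.
    """
    # Work on bytes so that %c3%85 becomes the two bytes C3 85,
    # not two separate code points.
    raw = value if isinstance(value, bytes) else value.encode('utf-8')
    return unquote_to_bytes(raw).decode('utf-8')
```

With this helper, unquote_utf8("%c3%85ge") gives "Åge", which is what I would like to see from pywikibot.output as well.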