[ pywikipediabot-Support Requests-1871836 ] Encoding error while processing [[en:Chişinău]]

SourceForge.net noreply at sourceforge.net
Tue Jan 15 16:21:54 UTC 2008


Support Requests item #1871836, was opened at 2008-01-15 12:27
Message generated for change (Comment added) made by rotemliss
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=1871836&group_id=93107

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: André Malafaya Baptista (malafaya)
Assigned to: Nobody/Anonymous (nobody)
Summary: Encoding error while processing [[en:Chişinău]]

Initial Comment:
I'm almost sure this is not a bug so I'm putting it here in Support Requests.
When I try to process the page [[en:Chişinău]] (with bot account on 'en'), I get the following result:

X:\>interwiki.py -lang:en Chi%C5%9Fin%C4%83u

Checked for running processes. 2 processes currently running, including the current process.
Getting 1 pages from wikipedia:en...
[[Chisinau]]: [[en:Chisinau]] gives new interwiki [[lt:Kisiniovas]]
[[Chisinau]]: [[en:Chisinau]] gives new interwiki [[lv:Kisineva]]

(...output omitted deliberately...)

======Post-processing [[en:Chisinau]]======
Updating links on page [[en:Chisinau]].
Exception in Page constructor
Dump en (wikipedia) saved
Traceback (most recent call last):
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1606, in <module>
    bot.run()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1381, in run
    self.queryStep()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1360, in queryStep
    subj.finish(self)
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 967, in finish
    if self.replaceLinks(page, new, bot):
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1010, in replaceLinks
    ignorepage = wikipedia.Page(page.site(), iw.groups()[0])
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\wikipedia.py", line 425, in __init__
    % (site, title, insite, defaultNamespace)
  File "D:\Program Files\Python\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 28-34: character maps to <undefined>
================

I believe it's an invalid UTF-8 byte sequence somewhere in the page, but I'd like someone more experienced to verify this.
Thanks.


----------------------------------------------------------------------

>Comment By: Rotem Liss (rotemliss)
Date: 2008-01-15 18:21

Message:
Logged In: YES 
user_id=1327030
Originator: NO

This may be fixed in r4893 (I tried to fix a possible unicode problem in
output, and a possible KeyError for an obsolete site). If it isn't fixed,
does the output change?
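
For reference, the traceback ends in encodings\cp850.py, so one possible
reading is that the error is raised while text is being encoded for the
Windows console code page, not while the page itself is decoded. The sketch
below only illustrates that kind of failure and one common workaround
(replacing unencodable characters). It is written for Python 3, and
safe_console_text is a hypothetical helper, not a pywikipediabot function
and not necessarily what r4893 does.

```python
# Hypothetical helper for illustration only; not part of pywikipediabot.
def safe_console_text(text, encoding="cp850"):
    """Return text with characters the given encoding cannot represent
    replaced by '?', so that printing it to such a console cannot fail."""
    return text.encode(encoding, errors="replace").decode(encoding)


title = "Chi\u015fin\u0103u"  # the page title from this request

# Strict encoding to cp850 fails because the two non-ASCII letters in the
# title are not in that code page.
try:
    title.encode("cp850")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(strict_ok)
print(safe_console_text(title))
```

With a console that uses a different code page, the encoding argument would
need to match it (sys.stdout.encoding usually reports the one in use).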

----------------------------------------------------------------------
