[ pywikipediabot-Support Requests-1871836 ] Encoding error while processing [[en:Chişinău]]
SourceForge.net
noreply at sourceforge.net
Tue Jan 15 18:21:59 UTC 2008
Support Requests item #1871836, was opened at 2008-01-15 10:27
Message generated for change (Comment added) made by malafaya
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=1871836&group_id=93107
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: André Malafaya Baptista (malafaya)
Assigned to: Nobody/Anonymous (nobody)
Summary: Encoding error while processing [[en:Chişinău]]
Initial Comment:
I'm almost sure this is not a bug so I'm putting it here in Support Requests.
When I try to process the page [[en:Chişinău]] (with bot account on 'en'), I get the following result:
X:\>interwiki.py -lang:en Chi%C5%9Fin%C4%83u
Checked for running processes. 2 processes currently running, including the current process.
Getting 1 pages from wikipedia:en...
[[Chisinau]]: [[en:Chisinau]] gives new interwiki [[lt:Kisiniovas]]
[[Chisinau]]: [[en:Chisinau]] gives new interwiki [[lv:Kisineva]]
(...output omitted deliberately...)
======Post-processing [[en:Chisinau]]======
Updating links on page [[en:Chisinau]].
Exception in Page constructor
Dump en (wikipedia) saved
Traceback (most recent call last):
File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1606, in <module>
bot.run()
File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1381, in run
self.queryStep()
File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1360, in queryStep
subj.finish(self)
File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 967, in finish
if self.replaceLinks(page, new, bot):
File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1010, in replaceLinks
ignorepage = wikipedia.Page(page.site(), iw.groups()[0])
File "D:\Work\pywikipediabot-HEAD\pywikipedia\wikipedia.py", line 425, in __init__
% (site, title, insite, defaultNamespace)
File "D:\Program Files\Python\lib\encodings\cp850.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 28-34: character maps to <undefined>
================
I believe it's an invalid UTF-8 byte sequence somewhere in the page, but I'd like someone more experienced to verify this.
Thanks.
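
One way to check the invalid-UTF-8 hypothesis is sketched below. The traceback ends in cp850.py's encode and raises UnicodeEncodeError, which suggests the failure happens while text is being encoded for the Windows console rather than while the page is decoded. The sketch assumes Python 3 (the bot at the time ran on Python 2), and `can_encode` is a hypothetical helper, not part of pywikipedia. It only tests the title from the command line, not the page content.

```python
from urllib.parse import unquote

# The title exactly as passed on the command line in the report.
# errors="strict" makes unquote raise if the bytes are not valid UTF-8.
title = unquote("Chi%C5%9Fin%C4%83u", encoding="utf-8", errors="strict")


def can_encode(text, encoding):
    """Hypothetical helper: True if every character of text exists in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


utf8_ok = can_encode(title, "utf-8")    # the title is representable in UTF-8
cp850_ok = can_encode(title, "cp850")   # the console code page from the traceback
print(utf8_ok, cp850_ok)
```

If the first value is true and the second is false, the title itself is valid and the problem lies in the console code page, which has no mapping for characters such as ş and ă.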
----------------------------------------------------------------------
>Comment By: André Malafaya Baptista (malafaya)
Date: 2008-01-15 18:21
Message:
Logged In: YES
user_id=1037345
Originator: YES
I tried r4893 (latest) and got exactly the same result. The call
stack and output are identical:
UnicodeEncodeError: 'charmap' codec can't encode characters in position
28-34: character maps to <undefined>
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2008-01-15 16:21
Message:
Logged In: YES
user_id=1327030
Originator: NO
This may be fixed in r4893 (I tried to fix a possible Unicode problem in
output, and a possible KeyError for an obsolete site). If it isn't, does
the output change?
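
For illustration only, a common way to make console output tolerant of characters missing from the code page is to encode with a replacement error handler. This is a minimal sketch in Python 3, not the change made in r4893; `console_safe` is a hypothetical name and cp850 is taken from the traceback in the report.

```python
def console_safe(text, encoding="cp850"):
    """Hypothetical helper: return text with unmappable characters replaced by '?'.

    Encoding with errors="replace" never raises UnicodeEncodeError, so a
    message containing a title like Chişinău can still be printed.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


message = console_safe("Updating links on page [[en:Chi\u015fin\u0103u]].")
print(message)
```

The trade-off is that the printed title is lossy, so this suits log and error messages but not text that is written back to the wiki.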
----------------------------------------------------------------------