On 14 February 2012 00:46, Bináris wikiposta@gmail.com wrote:
I went back to this conversation with Russell, and tried to use it in another way. I have console encoding problems with this command with Cyrillic letters:

    replace.py -catr:Венгрия . @ -lang:ru -excepttext:"[[hu:" -save:magyarok.txt -always

One way is to urlencode the Russian category. The other way is to insert it into a script. (DOS batch files won't work, I already tried.) So what I did:

    import replace
    replace.main(u'-catr:Венгрия', '.', '@', '-lang:ru', '-excepttext:"[[hu:"', '-save:magyarok.txt')

This results in an error message:

    File "C:\Pywikipedia\replace.py", line 582, in main
      for arg in pywikibot.handleArgs(*args):
    File "C:\Pywikipedia\wikipedia.py", line 7795, in handleArgs
      arg = _decodeArg(arg)
    File "C:\Pywikipedia\wikipedia.py", line 7767, in _decodeArg
      return unicode(arg, config.console_encoding)
    TypeError: decoding Unicode is not supported

If I omit the u before -catr, no error is thrown, but the name is decoded incorrectly. Now comes the trick! I went to line 7795 of the current wikipedia.py (r9894) as shown above, and commented it out. Now my script runs perfectly! I love it!
What happens is the following. At line 7767, arg is u'-catr:Венгрия', which is already a unicode object. The line then tries to *decode* a unicode string, which makes no sense: only a byte string (str) can be decoded.
The sensible solution would be to add a check, for instance something like

    return arg if isinstance(arg, unicode) else unicode(arg, config.console_encoding)

(which will not work on Python 2.4, since conditional expressions were only added in Python 2.5, so a normal if/else might be preferable).
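Spelled out as a normal if/else, the check could look roughly like the sketch below. This is not the actual code in wikipedia.py: decode_arg, text_type and CONSOLE_ENCODING are made-up names for illustration, and CONSOLE_ENCODING only stands in for config.console_encoding. The text_type shim is there so the same snippet can be tried on Python 3 as well, where str plays the role of unicode.

```python
# -*- coding: utf-8 -*-
# Sketch only: decode_arg, text_type and CONSOLE_ENCODING are hypothetical
# names, not part of wikipedia.py.

try:
    text_type = unicode  # Python 2: the text type is called unicode
except NameError:
    text_type = str      # Python 3: str is the text type

# Stand-in for config.console_encoding (assumed value for this sketch).
CONSOLE_ENCODING = 'utf-8'


def decode_arg(arg, encoding=CONSOLE_ENCODING):
    """Return arg as text, decoding it only if it is a byte string."""
    if isinstance(arg, text_type):
        # Already text: decoding again would raise TypeError.
        return arg
    # Byte string (e.g. from the console): decode with the given encoding.
    return text_type(arg, encoding)
```

With such a guard, both a script passing u'-catr:Венгрия' and the console passing encoded bytes would end up with the same unicode argument, so there would be no need to comment out the call in handleArgs.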
Merlijn