https://bugzilla.wikimedia.org/show_bug.cgi?id=55018
Web browser: ---
Bug ID: 55018
Summary: standardize_notes.py encoding
Product: Pywikibot
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: Unprioritized
Component: General
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: legoktm.wikipedia@gmail.com
Classification: Unclassified
Mobile Platform: ---
Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/327/
Reported by: n-fran
Created on: 2013-01-25 14:38:46
Subject: standardize_notes.py encoding
Original description:
If I add Russian letters to the text of the script, I get this error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128)
To avoid this error, I think these lines (or something similar) need to be added to the bot's code:
# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
After that, my bot started working. Thanks.
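The error quoted above can be reproduced outside of Pywikibot. The following is only a minimal sketch, assuming the script file was saved as UTF-8; the helper name decode_as_ascii is hypothetical and not part of standardize_notes.py. It decodes the UTF-8 bytes of a Russian word with the ASCII codec, which is what Python 2 does implicitly when a plain byte string is combined with a unicode string.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the reported message without Pywikibot.

word = u'Примечания'

# Bytes as they would be stored in a UTF-8 encoded source file (assumption).
raw = word.encode('utf-8')


def decode_as_ascii(data):
    """Hypothetical helper: return (text, None) or (None, error message)."""
    try:
        return data.decode('ascii'), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


text, error = decode_as_ascii(raw)
print(error)
```

The first byte of the word is 0xd0, which is outside the ASCII range, so the decode fails at position 0. With the full heading string the failing position would be later, because the leading newline, equals signs and space are plain ASCII.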
--- Comment #1 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
I cannot follow what you mean by "add to the script". Do you want to modify the script, or enter Russian characters on the command line?
What is the complete error you got?
Did you set transliteration_target and console_encoding in your user-config.py?
Calling reload(sys) after import sys does not matter, since it just reloads the same module.
--- Comment #2 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
Sorry, my English may be poor, particularly when it comes to technical terms. I meant that I was putting Russian characters into the file standardize_notes.py. For example, I changed '\n== Notes ==\n' to '\n== Примечания ==\n' (line 987), and then this error appeared:
When I added the lines quoted above at the beginning of the file, the problem disappeared. Thank you.
--- Comment #3 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
In my user-config.py there are these lines:
console_encoding = 'cp1251'
transliteration_target = console_encoding
but there are still a lot of problems with the encoding. Thank you.
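As a rough illustration of why the console encoding and the encoding of the script file are separate concerns, the sketch below encodes the same word as cp1251 and as UTF-8 and then reads the UTF-8 bytes with the wrong codec. It is standalone Python and does not touch user-config.py; the variable names are chosen for this example only.

```python
# -*- coding: utf-8 -*-
# Sketch only: the same text yields different bytes under different codecs.

text = u'Примечания'

as_cp1251 = text.encode('cp1251')  # one byte per Cyrillic letter
as_utf8 = text.encode('utf-8')     # two bytes per Cyrillic letter

# Decoding with the codec that produced the bytes restores the text.
round_trip = as_cp1251.decode('cp1251')

# Decoding UTF-8 bytes as cp1251 does not raise here, but gives garbage.
mojibake = as_utf8.decode('cp1251')

print(len(as_cp1251), len(as_utf8))
print(round_trip == text, mojibake == text)
```

Setting console_encoding = 'cp1251' only describes how text is exchanged with the terminal. It does not change how Python interprets non-ASCII string literals inside a script file.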
--- Comment #4 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com ---
When using Python 2.x there are two kinds of strings:
ASCII strings are written like "This is an ascii string"
unicode strings are written like u"This is a unicode string"
Just write a u before the string in line 987 (and remove the reload/encoding stuff):
new_text = new_text + u'\n== Notes ==\n' # set to standard name
But OK, this part should be localized.
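The difference between the two kinds of strings can be sketched as follows. This is an illustration under stated assumptions, not the code from standardize_notes.py: append_heading is a hypothetical helper, and because Python 3 never mixes bytes and text implicitly, the implicit ASCII decode that Python 2 performs is written out explicitly here.

```python
# -*- coding: utf-8 -*-
# Sketch only: imitates how Python 2 joins a byte string to a unicode string.


def append_heading(page_text, heading):
    """Hypothetical helper: append a section heading to page_text."""
    if isinstance(heading, bytes):
        # Python 2 performs this ASCII decode implicitly when a plain
        # string literal is added to a unicode string.
        heading = heading.decode('ascii')
    return page_text + heading


page_text = u'Текст статьи'

# A unicode literal (u'...') is joined without any decoding step.
ok = append_heading(page_text, u'\n== Примечания ==\n')

# The same heading as UTF-8 bytes fails in the ASCII decode.
try:
    append_heading(page_text, u'\n== Примечания ==\n'.encode('utf-8'))
    failed = False
except UnicodeDecodeError:
    failed = True

print(failed)
```

With the u prefix the heading is already text, so nothing has to be decoded and the sys.setdefaultencoding workaround is not needed for this line.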
--- Comment #5 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **priority**: 5 --> 3
Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
See Also|        |https://sourceforge.net/p/pywikipediabot/feature-requests/327
John Mark Vandenberg jayvdb@gmail.com changed:
What     |Removed |Added
----------------------------------------------------------------------------
CC       |        |jayvdb@gmail.com
Component|General |Other scripts
John Mark Vandenberg jayvdb@gmail.com changed:
What   |Removed     |Added
----------------------------------------------------------------------------
Version|unspecified |compat (1.0)