Nicolas Dumazet wrote:
> The reason why we are encoding our strings/unicode objects to the site's encoding is obvious: we don't trust ourselves to remember to encode each string properly before each put, and failing to encode a string properly would result in a garbage write. It has always been this way; my recent changes did not introduce that behavior.
This has worked properly in the past for me and for others: once the unicode string is supplied, everything works. Even Unicode page titles work perfectly. So, yes, strings _are_ properly encoded on HTTP output.
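To make the point concrete, here is a minimal sketch of the "encode centrally, right before the HTTP write" pattern described above. The `Site` class and the `prepare_for_put` helper are hypothetical illustrations for this thread, not the real pywikipedia API.

```python
# Sketch of encoding unicode text to the site's charset in one central
# place, so individual callers cannot forget to do it. Hypothetical
# names; the real pywikipedia code is organized differently.

class Site:
    def encoding(self):
        # The wiki's charset, as reported by the site configuration.
        return 'utf-8'

def prepare_for_put(site, text):
    """Return the byte payload for an HTTP put.

    Doing the encode here, rather than at every call site, is what
    prevents the garbage writes mentioned above.
    """
    if isinstance(text, bytes):
        # Already-encoded input: decode first so we re-encode
        # consistently in the site's charset.
        text = text.decode(site.encoding())
    return text.encode(site.encoding())
```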
> I introduced the unicode check because a user did not understand why a UnicodeDecodeError was triggered by this put:
>
>     text = open('file_in_utf8_with_non-ascii_chars').read()
>     page.put(text)
Please educate your user about Unicode I/O in Python :-) Adding .decode("filescharset") after read() should help. This is not a problem in pywikipedia, but in the user's code.
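A self-contained illustration of that fix (the file name and charset are just examples standing in for the user's actual file):

```python
import io
import os
import tempfile

# Write a UTF-8 file containing non-ASCII text, as in the user's case.
path = os.path.join(tempfile.mkdtemp(), 'file_in_utf8.txt')
with io.open(path, 'wb') as f:
    f.write(u'Z\u00fcrich'.encode('utf-8'))

# read() on a binary handle gives raw bytes; passing them on as-is is
# what triggers the UnicodeDecodeError later, deep inside the library.
raw = open(path, 'rb').read()

# The fix belongs in the user code: decode with the file's charset
# immediately after read(), so the library only ever sees unicode.
text = raw.decode('utf-8')
```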
A UnicodeDecodeError says exactly the same thing you are trying to say with "PageNotSavedError", except that in that form the application cannot properly catch it. Just like Merlin wrote earlier.
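To sketch why a library-level exception is easier to handle: wrapping the low-level error lets applications catch the failure deliberately instead of letting a UnicodeDecodeError escape from deep inside the library. PageNotSavedError is the name from this thread; the put() wrapper below is a hypothetical illustration, not the real API.

```python
# Hypothetical sketch: translate an encoding failure into an error the
# caller can handle by name.

class PageNotSavedError(Exception):
    """The page could not be saved; carries the underlying cause."""

def put(text, site_encoding='utf-8'):
    """Encode text for the HTTP write, raising a catchable error."""
    try:
        if isinstance(text, bytes):
            # Bytes that are not valid in the site's charset fail here.
            text = text.decode(site_encoding)
        return text.encode(site_encoding)
    except UnicodeDecodeError as exc:
        raise PageNotSavedError('text is not valid %s: %s'
                                % (site_encoding, exc))
```

The application can then write `except PageNotSavedError:` around its saves, which is exactly what it cannot sanely do with a bare UnicodeDecodeError.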
Can we go back to the 5801 solution?
--Marcin