Bugs item #3081100, was opened at 2010-10-04 19:53
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3081100...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: interwiki
Group: None
Status: Open
Resolution: Remind
Priority: 7
Private: No
Submitted By: Grimlock (grimlockfr)
Assigned to: xqt (xqt)
Summary: Problem with hi characters
Initial Comment:
Pywikipedia [http] trunk/pywikipedia (r8602, 2010/10/04, 19:33:48)
Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = Tru
My interwiki bot on Wikipedia (using interwiki.py) cannot correctly identify the interwiki link to hi. As a consequence, the link is classified as a bad one and removed when I use the -cleanup option (see http://fr.wikipedia.org/w/index.php?title=Mark_Zuckerberg&action=history... for an example). It appears that one or more characters are misinterpreted.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2010-10-27 20:36
Message: Probably related to http://svn.python.org/view/python/branches/release26-maint/Modules/unicodeda... , and hence http://bugs.python.org/issue1054943# and http://www.unicode.org/review/pr-29.html
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody) Date: 2010-10-27 20:22
Message: Okay, this seems to be a Python 2.6/2.7 or MediaWiki bug. It is related to normalizing Unicode strings.
Check out the following:
(on py27)
Python 2.7 (r27:82500, Aug 5 2010, 04:28:45) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.normalize('NFC', u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917') == u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917'
False
(on py26):
valhallasw@willow:~/src/pywikipedia-svn$ python2.6
Python 2.6.5 (r265:79063, Jul 10 2010, 17:50:38) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.normalize('NFC', u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917') == u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917'
True
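
A minimal sketch for repeating the comparison above on any interpreter and seeing where NFC output diverges, if it does. It is written for Python 3 and uses only the standard unicodedata module. The helper names (nfc_is_stable, first_difference, describe) and the HI_TITLE constant are made up for this illustration and are not part of pywikipedia; the string is the one from the sessions above, assuming a single space between the two words.

```python
import unicodedata

# Title from the sessions above, written as escaped code points.
# Assumption: the two words are separated by one ASCII space.
HI_TITLE = (
    "\u092e\u093e\u0930\u094d\u0915 "
    "\u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917"
)


def nfc_is_stable(text):
    """Return True if NFC normalization leaves text unchanged."""
    return unicodedata.normalize("NFC", text) == text


def first_difference(text):
    """Return the index where text and its NFC form first differ, or None."""
    normalized = unicodedata.normalize("NFC", text)
    for index, (before, after) in enumerate(zip(text, normalized)):
        if before != after:
            return index
    if len(text) != len(normalized):
        # One string is a prefix of the other.
        return min(len(text), len(normalized))
    return None


def describe(text):
    """Return (code point, canonical combining class) for each character."""
    return [("U+%04X" % ord(ch), unicodedata.combining(ch)) for ch in text]


if __name__ == "__main__":
    # The Unicode database version bundled with this interpreter.
    print("unidata version:", unicodedata.unidata_version)
    print("NFC stable:", nfc_is_stable(HI_TITLE))
    print("first difference at:", first_difference(HI_TITLE))
    for code_point, combining_class in describe(HI_TITLE):
        print(code_point, combining_class)
```

Running this on two interpreters and comparing the "NFC stable" line is one way to tell whether a given Python build shows the behaviour reported here. The combining classes are printed because NFC composition depends on them.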
----------------------------------------------------------------------
Comment By: tjmoel (tjmoel) Date: 2010-10-22 21:34
Message: Hi, my bot still makes the same mistake: http://id.wikipedia.org/w/index.php?title=Archimedes&action=historysubmi...
Any idea how to solve this? Thanks
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2010-10-12 07:10
Message: Some bots are still affected by this bug: http://de.wikipedia.org/wiki/Spezial:Missbrauchsfilter-Logbuch?title=Spezial...
----------------------------------------------------------------------
Comment By: DJSasso (djsasso) Date: 2010-10-07 19:02
Message: Never mind... I just noticed that you made a change to not remove hi links in autonomous mode.
----------------------------------------------------------------------
Comment By: DJSasso (djsasso) Date: 2010-10-07 18:38
Message: I should note that I updated to the most recent build this morning and have not seen it since. It has been about 6 hours now. So it may have been fixed in the most recent build, or I may just have been lucky and not had any hi links get mistaken in that time.
----------------------------------------------------------------------
Comment By: DJSasso (djsasso) Date: 2010-10-07 18:21
Message: Yeah, look at my edits on de. I reverted a bunch of my bot's changes.
http://de.wikipedia.org/wiki/Spezial:Beitr%C3%A4ge/Djsasso
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2010-10-07 16:35
Message: Most problems came from SassoBot, MastiBot, User:ChuispastonBot, VolkowBot, see http://de.wikipedia.org/wiki/Wikipedia:Bots/Notizen#Interwiki-Probleme_mit_h...
With the current py version, deletion of hi links is stopped. Well, I'll investigate your hint. Do you have some examples for me?
----------------------------------------------------------------------
Comment By: DJSasso (djsasso) Date: 2010-10-07 12:26
Message: While cleaning up my bot's edits on one wiki, I have seen at least 4 other bots doing this recently. So there is clearly an issue somewhere. I was running the new -cleanup option, so maybe that is what causes it.
----------------------------------------------------------------------
Comment By: DJSasso (djsasso) Date: 2010-10-07 10:33
Message: It is doing it for me as well. It has been for the last few days, but since other bots seemed to fix it immediately, I didn't think it was a big issue, or thought it was maybe my machine. So I was trying to figure it out on my own. But if it's happening to others, it's clearly not just my machine.
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2010-10-05 13:17
Message: I found this bug this morning but now it works as expected.
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org