Patches item #3355772, was opened at 2011-07-06 07:49
Message generated for change (Tracker Item Submitted) made by loxley
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3355772&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: K.-M. Hansche (loxley)
Assigned to: Nobody/Anonymous (nobody)
Summary: Spellcheck.py – Print title
Initial Comment:
Makes spellcheck.py print the title of the active page when asking for input.
----------------------------------------------------------------------
Patches item #3355767, was opened at 2011-07-06 07:47
Message generated for change (Tracker Item Submitted) made by loxley
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3355767&group_…
Category: Translations
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: K.-M. Hansche (loxley)
Assigned to: Nobody/Anonymous (nobody)
Summary: German message for spellcheck.py
Initial Comment:
German message for spellcheck.py
----------------------------------------------------------------------
Bugs item #3081100, was opened at 2010-10-04 21:53
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3081100&group_…
Category: interwiki
Group: None
>Status: Pending
Resolution: Wont Fix
Priority: 7
Private: No
Submitted By: Grimlock (grimlockfr)
Assigned to: xqt (xqt)
Summary: Unicode bug: some page titles are mangled
Initial Comment:
Pywikipedia [http] trunk/pywikipedia (r8602, 2010/10/04, 19:33:48)
Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = True
My interwiki bot on Wikipedia (using interwiki.py) cannot correctly identify the interwiki link to hi. As a consequence, the link is treated as a bad one and is removed when I use the -cleanup option (see http://fr.wikipedia.org/w/index.php?title=Mark_Zuckerberg&action=historysub… for an example). It appears that one or more characters are misinterpreted.
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2011-07-03 15:26
Message:
Python 2.7.2 was released on Sun, 12 June 2011. This release no
longer triggers Unicode bug 3081100, which occurred for characters with
multiple accents (for example on hak-, hi-, cdo- and sa-wiki). Migrating
to this new release is highly recommended if the local version
has this bug.
Could we close this tracker item?
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-03-16 10:39
Message:
I cannot edit details, but I have edited the summary to be a bit more
descriptive.
----------------------------------------------------------------------
Comment By: Nemo (nemobis)
Date: 2011-03-16 09:35
Message:
Thank you. Could you please make the bug subject more descriptive? Even
after reading all comments I wasn't able to understand it completely, and it
would be better if bot runners, who are sent to this bug by interwiki.py,
could understand what the problem is and take the necessary measures (e.g.
not using -force or -cleanup, I suppose). Thank you very much!
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-03-16 09:27
Message:
It happens for any page title where the (correct) mediawiki unicode
normalization does not equal the (incorrect) python normalization. As a
general guideline, this only happens for characters with multiple accents
(say, 3 or so) - this does not only happen for hi:, though!
I think most latin and cyrillic character sets generally are safe. For
others, I have no idea - we have had reports for several languages.
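A rough way to see which titles are at risk on a given machine is to compare each title with its local NFC form. The sketch below is only an illustration and is not part of pywikipedia: the helper name and the sample list are made up, and it assumes the titles were already normalized by MediaWiki, so any difference points at the local interpreter (or at an input that was not normalized to begin with). It is written for Python 3.

```python
import unicodedata


def titles_changed_by_nfc(titles):
    """Return the titles that this interpreter's NFC normalization alters.

    Hypothetical helper: assumes every title is already in the form
    MediaWiki serves, so an altered title is a suspect one.
    """
    return [t for t in titles if unicodedata.normalize("NFC", t) != t]


# Sample data: the Hindi title from the interpreter sessions in this thread,
# plus a plain Latin title for comparison.
sample_titles = [
    "\u092e\u093e\u0930\u094d\u0915 "
    "\u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917",
    "Mark Zuckerberg",
]

suspect = titles_changed_by_nfc(sample_titles)
print("titles altered by local NFC:", len(suspect))
```

On an interpreter without the regression the list should come back empty; a non-empty result suggests avoiding -force and -cleanup on that machine.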
----------------------------------------------------------------------
Comment By: Nemo (nemobis)
Date: 2011-03-16 09:10
Message:
Does this bug affect other languages as well or is it safe to use
pywikipedia with this problem if you don't touch hi links?
----------------------------------------------------------------------
Comment By: Grimlock (grimlockfr)
Date: 2010-11-02 17:03
Message:
I used Python 2.7 when I discovered this bug. The bug is not fixed in 2.7
(or in all 2.7 distributions ..)
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-11-02 15:47
Message:
Just a quick update: upstream has confirmed this is a bug in the python
library. It should get fixed in 2.7 and 3.2, but it is not clear yet
whether 2.6.6 will have the fix included.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-10-30 17:43
Message:
Reported to the python developers: http://bugs.python.org/issue10254
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-10-30 16:52
Message:
C# test code: http://pastebin.ca/1977261
This does not show this regression. The C# library does not show PR29
issues.
I will file a bug with the python developers about this shortly.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-10-27 23:16
Message:
One last comment: the problem does not appear in python < 2.6.5. Consider
using an older python version if you work on wikimedia sites.
Added warning in r8687.
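For illustration only, a start-up check along these lines could decide whether such a warning is needed. This is not the code from r8687: the function name and the message text are assumptions, and the probe is the three-character sequence from the UtfNormal test elsewhere in this thread, which a correct NFC implementation leaves unchanged. Written for Python 3.

```python
import unicodedata
import warnings

# Probe sequence taken from the UtfNormal test in this thread
# (UTF-8 bytes e0 ad 87 cc 80 e0 ac be). Correct NFC leaves it unchanged.
PROBE = "\u0b47\u0300\u0b3e"


def nfc_is_broken():
    """Return True if the local unicodedata module alters the probe."""
    return unicodedata.normalize("NFC", PROBE) != PROBE


if nfc_is_broken():
    # Hypothetical wording; the real warning text may differ.
    warnings.warn(
        "This Python build mangles some page titles during NFC "
        "normalization (see bug 3081100); consider another version."
    )
```

Probing the behaviour directly avoids relying on version numbers, which matters here because it was unclear at the time which releases would carry the fix.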
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-10-27 22:54
Message:
The last comments were also mine.
Mediawiki does not show problems related to PR29:
<?php
include_once('UtfNormal.php');
print bin2hex("\xe0\xad\x87\xcc\x80\xe0\xac\xbe") . "\n";
print bin2hex(UtfNormal::cleanUp("\xe0\xad\x87\xcc\x80\xe0\xac\xbe")) . "\n";
returns the expected
e0ad87cc80e0acbe
e0ad87cc80e0acbe
where no information loss is happening. This means it might be a bug
introduced in the fix for pr29 in unicodedata.c.
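The same comparison can be run against Python's own normalizer. The following is a minimal sketch for Python 3 (it uses bytes.hex()), and the variable names are arbitrary. It prints the input bytes and the bytes after NFC normalization as hex, mirroring the PHP test above.

```python
import unicodedata

# Same byte sequence as in the PHP test above.
raw = b"\xe0\xad\x87\xcc\x80\xe0\xac\xbe"

# Decode, normalize to NFC with the local interpreter, and encode again.
normalized = unicodedata.normalize("NFC", raw.decode("utf-8")).encode("utf-8")

print(raw.hex())
print(normalized.hex())
```

If the two printed lines differ, the local unicodedata module is losing information in the same way the bot does.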
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2010-10-27 22:36
Message:
Probably related to
http://svn.python.org/view/python/branches/release26-maint/Modules/unicoded…
, and hence
http://bugs.python.org/issue1054943#
and
http://www.unicode.org/review/pr-29.html
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2010-10-27 22:22
Message:
Okay, this seems to be a python2.6/2.7 or mediawiki bug. It is related to
normalizing UTF-8 strings.
Check out the following:
(on py27)
Python 2.7 (r27:82500, Aug 5 2010, 04:28:45) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.normalize('NFC', u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917') == u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917'
False
(on py26):
valhallasw@willow:~/src/pywikipedia-svn$ python2.6
Python 2.6.5 (r265:79063, Jul 10 2010, 17:50:38) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.normalize('NFC', u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917') == u'\u092e\u093e\u0930\u094d\u0915 \u091c\u093c\u0941\u0915\u0947\u0930\u092c\u0930\u094d\u0917'
True
----------------------------------------------------------------------
Comment By: tjmoel (tjmoel)
Date: 2010-10-22 23:34
Message:
Hi, my bot still makes the mistake:
http://id.wikipedia.org/w/index.php?title=Archimedes&action=historysubmit&d…
Any idea on how to solve this? Thanks
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2010-10-12 09:10
Message:
Some bots are still affected by this bug:
http://de.wikipedia.org/wiki/Spezial:Missbrauchsfilter-Logbuch?title=Spezia…
----------------------------------------------------------------------
Comment By: DJSasso (djsasso)
Date: 2010-10-07 21:02
Message:
Never mind... I just noticed that you made a change to not remove hi links
in autonomous mode.
----------------------------------------------------------------------
Comment By: DJSasso (djsasso)
Date: 2010-10-07 20:38
Message:
I should note that this morning I updated to the most recent build and have
not seen it since, and it's been about 6 hours now. So it may have
been fixed in the most recent build. Or I may have just been lucky and
not had any hi links get mistaken in that time.
----------------------------------------------------------------------
Comment By: DJSasso (djsasso)
Date: 2010-10-07 20:21
Message:
Yeah, look at my edits on de. I reverted a bunch of my bot's changes.
http://de.wikipedia.org/wiki/Spezial:Beitr%C3%A4ge/Djsasso
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2010-10-07 18:35
Message:
Most problems came from SassoBot, MastiBot, User:ChuispastonBot,
VolkowBot, see
http://de.wikipedia.org/wiki/Wikipedia:Bots/Notizen#Interwiki-Probleme_mit_…
With the current py version, deletion of hi links is stopped. Well, I'll
investigate your hint. Do you have some examples for me?
----------------------------------------------------------------------
Comment By: DJSasso (djsasso)
Date: 2010-10-07 14:26
Message:
While doing some cleanup of my bot's edits on one wiki, I have seen at least
4 other bots doing this recently. So there is clearly an issue somewhere. I
was running the new -cleanup option, so maybe that is what causes it.
----------------------------------------------------------------------
Comment By: DJSasso (djsasso)
Date: 2010-10-07 12:33
Message:
It is doing it for me as well. It has been for the last few days, but since
other bots seemed to fix it immediately I didn't think it was a big issue,
or thought it was maybe my machine. So I was trying to figure it out on my
own. But if it's happening to others, it's clearly not just my machine.
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2010-10-05 15:17
Message:
I found this bug this morning but now it works as expected.
----------------------------------------------------------------------