Bugs item #3174600, was opened at 2011-02-07 02:10
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3174600&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: solve_disambiguation
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: dmitrynikitin ()
Assigned to: Nobody/Anonymous (nobody)
Summary: solve_disambiguation.py Syntax Error
Initial Comment:
Pywikipedia [http] trunk/pywikipedia (r8927, 2011/02/06, 05:07:33)
Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
-------------------------------------------
solve_disambiguation.py -help
File "solve_disambiguation.py", line 264
'fr': u'Homonymie résolue à l’aide du robot : %s - marquée comme demandant l'attention d'un expert',
^
SyntaxError: invalid syntax
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2011-02-17 02:26
Message:
fixed in r8950 by amir
----------------------------------------------------------------------
Comment By: Bináris (binbot)
Date: 2011-02-09 10:55
Message:
While the first apostrophe in "l’aide" differs from the apostrophe that
delimits the string, the two others in "l'attention d'un" are identical to
it. Depending on French spelling, they should either be changed to ’ or
preceded by an escape mark (\'), or the delimiters should be changed to
double quotes.
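The three fixes suggested above can be sketched as follows. This is only an illustration: the variable names are made up, and the thread does not say which form r8950 actually used.

```python
# -*- coding: utf-8 -*-
# Hypothetical variable names; in the script the message is one entry of a
# per-language dict.

# 1) Use the typographic apostrophe (U+2019) everywhere, so nothing inside
#    the text collides with the ' delimiter.
msg_typographic = u'Homonymie résolue à l’aide du robot : %s - marquée comme demandant l’attention d’un expert'

# 2) Keep the ASCII apostrophes but escape them with a backslash.
msg_escaped = u'Homonymie résolue à l’aide du robot : %s - marquée comme demandant l\'attention d\'un expert'

# 3) Keep the ASCII apostrophes and delimit the string with double quotes.
msg_double_quoted = u"Homonymie résolue à l’aide du robot : %s - marquée comme demandant l'attention d'un expert"

# Options 2 and 3 yield the same string; option 1 differs only in which
# apostrophe character is used.
assert msg_escaped == msg_double_quoted
assert (msg_typographic.replace(u'\u2019', u"'")
        == msg_escaped.replace(u'\u2019', u"'"))
```

Which option is right is a spelling question rather than a Python one, as the comment says.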
----------------------------------------------------------------------
Bugs item #3172101, was opened at 2011-02-04 00:03
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3172101&group_…
Category: category
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Private: No
Submitted By: Hercule (herculefr)
>Assigned to: xqt (xqt)
Summary: add french translation on category.py
Initial Comment:
for msg_replace :
'fr':u'Robot : Remplace %(oldcat)s par %(newcat)s',
for listify_msg:
'fr': u'Robot: Listage de %(fromcat)s (%(num)d éléments)',
for deletion_reason_remove
'fr': u'Robot: La catégorie a été supprimée',
thanks
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2011-02-17 02:18
Message:
for category.py i18n support has been ported to translatewiki.net
Please use the tools there for localizing translations.
BTW: such messages were ported to the bots framework.
----------------------------------------------------------------------
Bugs item #3183359, was opened at 2011-02-16 08:31
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3183359&group_…
Category: interwiki
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: JAn (jandudik)
>Assigned to: xqt (xqt)
Summary: minor bug, but fatal
Initial Comment:
Missing comma, see patch
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2011-02-17 02:07
Message:
I guess this was introduced with r8959
and is fixed with r8967
----------------------------------------------------------------------
Bugs item #3175720, was opened at 2011-02-08 16:09
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3175720&group_…
Category: interwiki
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 8
Private: No
Submitted By: JAn (jandudik)
>Assigned to: xqt (xqt)
Summary: Some languages not recognized
Initial Comment:
In the current version (r8938) some languages are not recognized as language codes. Their interwikis are moved to the top of the page and the language is not updated.
See
http://cs.wikipedia.org/w/index.php?title=Brusel&diff=prev&oldid=6503926
So far I know about gd, ku, zh-yue and tt.
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2011-02-17 02:03
Message:
The bug was introduced in r8926.
fixed in r8943.
----------------------------------------------------------------------
Comment By: betacommand (betacommand)
Date: 2011-02-08 18:24
Message:
Can you please post the results of version.py ?
----------------------------------------------------------------------
Bugs item #3182761, was opened at 2011-02-15 21:40
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3182761&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: GanZ (ganz-ru)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problem with Tibetan script
Initial Comment:
There is a heavy edit war here: http://en.wikipedia.org/w/index.php?title=Podolsk&action=history . Bots running the old Python version add an incorrect Tibetan interwiki link, while a bot running version 2.7.1 does it correctly.
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 11:19
Message:
Wikimedia bugzilla bug entry:
https://bugzilla.wikimedia.org/show_bug.cgi?id=27446
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 09:28
Message:
Aaand http://bugs.python.org/issue10567 is related to that.
In essence: bots running < 2.7 were technically doing the wrong thing, but
this went unnoticed as no-one used the interwiki to the Tibetan
Wikipedia, and all bots did the same wrong thing. Now there are bots
running 2.7+, from the toolserver, and the bug surfaced.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 09:21
Message:
JAn Dudik moved the page, so the problem should be fixed for now. Keeping
this open (it's a bug in pywikipedia, after all).
Related:
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u'\u200b'.strip()
u''
Python 2.7.1 (r271:86832, Jan 4 2011, 13:57:14)
[GCC 4.5.2] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> u'\u200b'.strip()
u'\u200b'
\u200b is technically not whitespace, so strip() probably should not
delete it.
Of course, pwb should not be stripping page titles in the first place.
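One way to stop depending on the interpreter's idea of whitespace is to strip an explicit set of characters. A minimal sketch, assuming a hypothetical helper `strip_ascii_whitespace`; this is not the code at xmlreader.py:194, and not necessarily how the bug should be fixed:

```python
# Characters we are willing to strip from the ends of a title. Anything else,
# including U+200B ZERO WIDTH SPACE, is left alone on every Python version.
ASCII_WHITESPACE = u' \t\r\n'


def strip_ascii_whitespace(title):
    """Strip only ASCII whitespace from both ends of a title (hypothetical helper)."""
    return title.strip(ASCII_WHITESPACE)


# Made-up title that ends in a ZERO WIDTH SPACE, padded with ASCII whitespace.
title = u'  Podolsk\u200b \n'
cleaned = strip_ascii_whitespace(title)
print(repr(cleaned))
```

Passing the character set explicitly means the result no longer depends on which characters a given Python release treats as whitespace in a bare strip().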
----------------------------------------------------------------------
Comment By: GanZ (ganz-ru)
Date: 2011-02-16 01:46
Message:
I agree that this symbol in titles is absolutely useless. Not only for
bots, but for ordinary users too, since it can break their copy-paste
operations.
If you can start a discussion on mediawiki-tech, please do so.
Thank you.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 01:34
Message:
Stripping is done in xmlreader.py:194. Calling strip() seems to remove the
U+200B character indeed.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 01:31
Message:
Except I don't have move privileges on that wiki. Added a comment on the
user talk page of the guy who created the page.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 01:10
Message:
Ok. There are a few things playing a role here.
1) the 'correct' page name ends in a 0x200B ZERO WIDTH SPACE. This makes
no sense, other than to annoy people.
2) the XML parser strips spaces around titles, including the 0x200B ZERO
WIDTH SPACE.
3) MediaWiki does *not* do this
So, first of all, I will rename the article, so it no longer has the
0x200B ZERO WIDTH SPACE in the title. I will see if I can pinpoint the XML
bug, so someone else may fix it. However, due to the fact bots are killing
each other about it, I suspect this is a small change somewhere in the
Python APIs - or a default setting that changed.
Lastly, maybe we should broaden the discussion into mediawiki-tech --
should page titles be allowed to have unicode whitespace characters
embedded, especially if they are invisible?
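To find titles affected by the question above, one could flag invisible format characters before acting on a title. A rough sketch with a hypothetical helper `invisible_chars`; it is not part of pywikipedia, and it assumes the interpreter's Unicode database classifies U+200B as a format character (category Cf), as current Python versions do:

```python
import unicodedata


def invisible_chars(title):
    """Return (index, character name) pairs for format characters (category Cf).

    Hypothetical helper for illustration only.
    """
    return [(i, unicodedata.name(ch, u'UNKNOWN'))
            for i, ch in enumerate(title)
            if unicodedata.category(ch) == 'Cf']


# Made-up title ending in a ZERO WIDTH SPACE.
found = invisible_chars(u'Podolsk\u200b')
for index, name in found:
    print(index, name)
```

A bot could log or skip such titles instead of silently normalizing them.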
----------------------------------------------------------------------
Comment By: GanZ (ganz-ru)
Date: 2011-02-16 00:28
Message:
Version.py:
Pywikipedia [http] trunk/pywikipedia (r8948, 2011/02/13, 09:19:56)
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit
(Intel)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
I'll be glad to help if you write the test code.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-16 00:17
Message:
I meant the output if you type those lines into the python interpreter -
but I've been poking around some more. It does not seem to be JSON related
- or maybe it is, or maybe it isn't. I think it has to do with some very
old code called 'getall', which gets batches of pages.
Sigh. I would very much like to say: "bad luck, try the rewrite" - I'm
almost afraid to touch that piece of code. I'll see if I can whip up a test
you can run, though, to confirm my suspicions.
In the meanwhile, could you post the output of version.py?
Thanks.
----------------------------------------------------------------------
Comment By: GanZ (ganz-ru)
Date: 2011-02-16 00:13
Message:
If I did it right, the output is 4 files:
__init__.pyc
decoder.pyc
encoder.pyc
scanner.pyc
Do you need them?
I'm sorry, I'm not a Python programmer.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-15 23:03
Message:
I see, indeed.
Could you post the output of
import query
print query.json.__file__
for your bot? I cannot reproduce the bug, so I suspect it might be due to
a buggy json package. I'll do some package sniffing to check this further.
----------------------------------------------------------------------
Comment By: GanZ (ganz-ru)
Date: 2011-02-15 22:22
Message:
I'm sorry. All of them have versions older than 2.6.5.
----------------------------------------------------------------------
Comment By: GanZ (ganz-ru)
Date: 2011-02-15 22:20
Message:
Many other bots have the same problem: my bot, LucienBOT, VolkovBot. All of
them have versions newer than 2.6.5.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-02-15 22:03
Message:
My crystal ball suggests:
* Wikitanvirbot is running from the toolserver, which has a patched 2.7.1
without unicode bug
* TXiKiBoT is running an old version of pywikipediabot (there is no
python version in the edit summary) on python 2.6.5+
Conclusion: TXiKiBoT should be blocked until its owner fixes his/her
setup.
----------------------------------------------------------------------
Patches item #3183321, was opened at 2011-02-15 22:30
Message generated for change (Tracker Item Submitted) made by cooties
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3183321&group_…
Category: Translations
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Alison Cassidy (cooties)
Assigned to: Nobody/Anonymous (nobody)
Summary: ga localization for cosmetic_changes.py
Initial Comment:
Hi there,
This is just a translation patch for the cosmetic_changes.py script, as used on the Irish language Wikipedia (ga.wikipedia.org)
Pywikipedia [http] trunk/pywikipedia (r8950, 2011/02/14, 02:08:30)
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
Thanks! :)
-- Allie (User:Alison, ga.wikipedia sysop/bureaucrat)
----------------------------------------------------------------------