Patches item #3539859, was opened at 2012-07-03 11:35
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=353985…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Eranroz (eranroz)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bugfix for optional capturing group
Initial Comment:
Patch for the replace function (replaceExcept) in pywikibot/textlib.py, adding
support for empty/optional capturing groups.
This is a bugfix for a crash that occurs when using replace.py with a regex
containing an optional capturing group (e.g. AAA in the regex "bla(AAA)?bla").
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2012-07-04 04:02
Message:
See my comment on the corresponding bug tracker item. It may be OK to
accept this patch, but I've asked for a third opinion on the matter.
----------------------------------------------------------------------
Comment By: Eranroz (eranroz)
Date: 2012-07-03 23:44
Message:
Yes, this is a bugfix for 3539444.
In short:
when running the replacement "ADMA (a)?poria" => "ADMA \1porya"
on text containing "ADMA poria" (with no "a" before "poria"), it crashes with
the following error:
doReplacements
res = replace.ReplaceRobot.doReplacements(self,original_text)
File "D:\myBot\python\pywikipedia-nightly\replace.py", line 390, in doReplacements
allowoverlap=self.allowoverlap)
File "D:\myBot\python\pywikipedia-nightly\pywikibot\textlib.py", line 179, in replaceExcept
match.group(groupID) + \
TypeError: coercing to Unicode: need string or buffer, NoneType found
You might suggest rewriting the specific regex, and that would probably work,
but it is just a workaround: a regex with an optional capturing group is valid
and should work properly.
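A minimal sketch of the failure and of one possible fix, for illustration only.
It is not the code from textlib.py or from the patch: the names GROUP_REF,
expand and tolerate_unmatched are hypothetical. It only assumes what the
traceback shows, namely that the replacement is built by concatenating
match.group(groupID), which is None when an optional group did not take part
in the match.

```python
import re

# Hypothetical helper: matches \1, \2, ... inside a replacement template.
GROUP_REF = re.compile(r'\\(\d+)')


def expand(match, template, tolerate_unmatched):
    """Expand backslash-number references in template from match's groups."""
    out = ''
    pos = 0
    for ref in GROUP_REF.finditer(template):
        # group() returns None for a group that did not participate.
        value = match.group(int(ref.group(1)))
        if value is None and tolerate_unmatched:
            value = ''  # treat the unmatched optional group as empty
        # Concatenating None here raises TypeError, as in the traceback.
        out = out + template[pos:ref.start()] + value
        pos = ref.end()
    return out + template[pos:]


pattern = re.compile(r'ADMA (a)?poria')
template = r'ADMA \1porya'
match = pattern.search('ADMA poria')
fixed = expand(match, template, tolerate_unmatched=True)
```

With tolerate_unmatched=False the call raises TypeError on this input; with
True the unmatched group is expanded as an empty string.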
Longer story :)
In Hebrew Wikipedia there is a list of regexes that are used for
replacements in almost all articles, which is here:
http://he.wikipedia.org/wiki/%D7%95%D7%A7:%D7%A8%D7%94
The columns in the table there are:
ID | old | new | exceptText
The list is used by a C# bot implementation, which isn't active, and by a JS
userscript implementation, which is used for replacements on specific pages.
I have ported it to work with replace.py, but it fails when it gets to a
replacement with an optional capturing group. After applying my fix locally, I
ran it for 250 test edits and it worked properly without crashes.
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2012-07-03 21:48
Message:
Is this patch for bug #3539444?
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2012-07-03 21:26
Message:
I don't understand this bug. What is the traceback before this patch is
applied? And what should replaceExcept() do in your special case?
Could you give me a full example? You may exclude this group with
"bla(?:AAA)?bla"; would this help?
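For comparison, a small sketch of how the two spellings differ in Python's re
module. The variable names are illustrative. The non-capturing form matches
the same text but defines no group, so it only helps when the replacement does
not need to refer to the optional part as \1.

```python
import re

capturing = re.compile(r'bla(AAA)?bla')
non_capturing = re.compile(r'bla(?:AAA)?bla')

# Both patterns match text with and without the optional part.
m_cap = capturing.search('blabla')
m_non = non_capturing.search('blabla')

# The capturing pattern defines group 1, which is None when unmatched.
unmatched_value = m_cap.group(1)

# The non-capturing pattern defines no groups at all.
group_counts = (capturing.groups, non_capturing.groups)
```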
----------------------------------------------------------------------