Bugs item #3539444, was opened at 2012-07-02 06:25
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3539444...

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: replace doesn't support optional groups
Initial Comment: textlib.py (method replaceExcept) doesn't support optional capturing groups in regex.
I ran replace.py with the regex "RISHMI(T |IM)?" and the replacement "RISHMI\1". On a page containing the text "SOMETHING RISHMI SOMETHING", it crashes with the following error:

  textlib.py, line 178, in replaceExcept
    match.group(groupID) + \
TypeError: coercing to Unicode: need string or buffer, NoneType found
Line 178 contains the statement:

    replacement = replacement[:groupMatch.start()] + \
                  match.group(groupID) + \
                  replacement[groupMatch.end():]
textlib.py should check whether match.group(groupID) is None and, if so, insert an empty string instead of match.group(groupID).
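
A minimal sketch of the proposed check, not the actual textlib.py code: the helper name expand_groups and the pattern GROUP_REF are hypothetical, and only numbered references such as \1 are handled. It shows a group that did not take part in the match being replaced by an empty string:

```python
import re

# Hypothetical pattern for numbered back-references such as \1 (assumption;
# the real code may also handle named references).
GROUP_REF = re.compile(r'\\(\d+)')

def expand_groups(match, replacement):
    """Substitute numbered back-references in replacement with groups from match."""
    ref = GROUP_REF.search(replacement)
    while ref:
        group_id = int(ref.group(1))
        # match.group() returns None for an optional group that did not
        # participate in the match; fall back to an empty string.
        value = match.group(group_id) or ''
        replacement = replacement[:ref.start()] + value + replacement[ref.end():]
        # Continue searching after the text that was just inserted.
        ref = GROUP_REF.search(replacement, ref.start() + len(value))
    return replacement

pattern = r'RISHMI(T |IM)?'
m = re.search(pattern, "SOMETHING RISHMI SOMETHING")
result = expand_groups(m, r'RISHMI\1')
print(repr(result))  # 'RISHMI'

m2 = re.search(pattern, "SOMETHING RISHMIT SOMETHING")
print(repr(expand_groups(m2, r'RISHMI\1')))  # 'RISHMIT '
```

Whether silently substituting an empty string is the desired behaviour is a design decision for the maintainers; the sketch only shows that the check itself is small.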
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2012-07-03 22:20
Message: The group must exist to reuse it. What should this regex do, in your opinion? What about "RISHMI(T |IM|)" or "RISHMI((?:T |IM)?)"? Errors should never pass silently unless explicitly silenced (PEP 20). Maybe replacing empty strings could lead to unwanted side effects, but I haven't thought about it.
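
A short sketch of the difference between the three patterns, using only the standard re module. It assumes the second suggested pattern was meant to start with "RISHMI". With the empty alternative or the nested non-capturing group, group 1 always takes part in the match and yields an empty string rather than None:

```python
import re

text = "SOMETHING RISHMI SOMETHING"

# Original pattern: the whole capturing group is optional.
optional = re.search(r'RISHMI(T |IM)?', text)
# Suggested alternative 1: an empty alternative inside the group.
empty_alt = re.search(r'RISHMI(T |IM|)', text)
# Suggested alternative 2: the optional part is moved into a
# non-capturing group nested inside the capturing group.
nested = re.search(r'RISHMI((?:T |IM)?)', text)

print(repr(optional.group(1)))   # None
print(repr(empty_alt.group(1)))  # ''
print(repr(nested.group(1)))     # ''
```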
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org