jenkins-bot merged this change.

Approvals:
  Xqt: Looks good to me, approved
  jenkins-bot: Verified

textlib.py: Avoid zero-width matching groups

This is a small workaround for https://bugs.python.org/issue12177 .
The MemoryError raised by re in Python 2.7.2 and 2.7.3 appears to be related
to zero-width matching groups.

Using + instead of * in the filename and other_chars groups avoids a
zero-width match. To keep each group optional, the whole positive lookahead
and its backreference are wrapped in an optional group.
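
The rewrite can be illustrated in isolation with a minimal sketch. It assumes
only the other_chars fragment, compiled without re.VERBOSE; the names OLD, NEW
and consumed are hypothetical and do not exist in textlib.py. The sketch shows
that both forms consume the same text, and that the only visible difference is
the value of the named group when nothing matches: an empty string before, None
after.

```python
import re

# Old form: the lookahead captures zero or more characters, so the named
# group can match the empty string (a zero-width match).
OLD = re.compile(r'(?=(?P<other_chars>[^\[\]]*))(?P=other_chars)')

# New form: the named group needs at least one character; the lookahead
# and its backreference are wrapped together in an optional group.
NEW = re.compile(r'((?=(?P<other_chars>[^\[\]]+))(?P=other_chars))?')


def consumed(pattern, text):
    """Return (whole match, other_chars group) for a match at position 0."""
    match = pattern.match(text)
    return match.group(0), match.group('other_chars')


# Both forms consume the same run of non-bracket characters.
print(consumed(OLD, 'abc]'))  # ('abc', 'abc')
print(consumed(NEW, 'abc]'))  # ('abc', 'abc')

# With nothing to consume, the old group is '' and the new group is None.
print(consumed(OLD, ']'))     # ('', '')
print(consumed(NEW, ']'))     # ('', None)
```

Code that reads other_chars or filename from the match object may therefore
need to handle None where it previously saw an empty string.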

Bug: T191161
Change-Id: Ibfc8b8f961bdb13284aa5592fd9b7597e47f9d97
---
M pywikibot/textlib.py
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index f959673..694e781 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -107,9 +107,9 @@
\[\[\s*
(?:%s) # namespace aliases
\s*:
- (?=(?P<filename>
- [^]|]*
- ))(?P=filename)
+ ((?=(?P<filename>
+ [^]|]+ # * quantifier may crash on Python 2.7.2 (T191161)
+ ))(?P=filename))?
(
\|
(
@@ -118,9 +118,9 @@
\[\[.*?\]\]
))(?P=inner_link)
)?
- (?=(?P<other_chars>
- [^\[\]]*
- ))(?P=other_chars)
+ ((?=(?P<other_chars>
+ [^\[\]]+ # * quantifier may crash on Python 2.7.2 (T191161)
+ ))(?P=other_chars))?
|
(?=(?P<not_wikilink>
\[[^]]*\]

To view, visit change 423451.

Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibfc8b8f961bdb13284aa5592fd9b7597e47f9d97
Gerrit-Change-Number: 423451
Gerrit-PatchSet: 2
Gerrit-Owner: Dalba <dalba.wiki@gmail.com>
Gerrit-Reviewer: Dalba <dalba.wiki@gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb@gmail.com>
Gerrit-Reviewer: Xqt <info@gno.de>
Gerrit-Reviewer: Zoranzoki21 <zorandori4444@gmail.com>
Gerrit-Reviewer: jenkins-bot <>