Bugs item #3158761, was opened at 2011-01-15 10:17
Message generated for change (Comment added) made by binbot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=315876…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Bináris (binbot)
Assigned to: Nobody/Anonymous (nobody)
Summary: Template exception overworks in replace.py
Initial Comment:
I correct spelling mistakes with replace.py, using the following exceptions:
    'exceptions': {
        'inside-tags': [
            'hyperlink',
            'template',
        ],
etc. as shown at
http://meta.wikimedia.org/wiki/Pywikipediabot/replace.py/it
This exception excludes a lot of text that should be replaced! After a long investigation,
I suspect that the problem occurs when the template is complicated, e.g. when the article
begins with an infobox. The bot probably thinks it is still inside the template after the
template has already been closed.
Examples:
In the last sentence of section
http://hu.wikipedia.org/w/index.php?title=Nagyv%C3%A1rad&oldid=9085449#…
the word "telepitettek" was not found. The article begins with an infobox.
In the middle of section
http://hu.wikipedia.org/w/index.php?title=Opera_%28sz%C3%ADnm%C5%B1%29&…
the word "Szenitávnéji" was not found. The article has no infobox, but the text
is preceded by some templates with parameters, one of them at the very beginning.
In section
http://hu.wikipedia.org/w/index.php?title=Tennessee&oldid=9028125#Megy.… the word
"alapitási" was not found. The article begins with an infobox.
But:
The bot made the replacement here:
http://hu.wikipedia.org/w/index.php?title=Mozilla&diff=9106942&oldi…
This text is also preceded by some templates that have parameters, but the one at the
beginning of the article has no parameters. Does this make the difference?
All of the above-mentioned instances were found by the bot once I commented the word
"template" out of the exceptions.
It is not clear whether the bug is in replace.py or in pagegenerators.
----------------------------------------------------------------------
Comment By: Bináris (binbot)
Date: 2011-01-15 23:36
Message:
Hurray, I have caught it! The bugfix is easy. In pywikibot/textlib.py, line
83, the quantifier on the outer group is greedy, so a single match can run from
the first {{ to the last }} of the page. Changing
'template': re.compile(r'(?s){{(({{.*?}})|.)*}}'),
to
'template': re.compile(r'(?s){{(({{.*?}})|.)*?}}'),
solved the problem for me.
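
For anyone who wants to reproduce the difference outside the bot, here is a minimal sketch. The two patterns are copied from above; the helper name excluded_spans and the sample wikitext are made up for illustration and are not part of pywikibot. It only lists what each pattern would treat as "inside a template".

```python
import re

# The two patterns quoted above: original (greedy) and proposed (lazy).
GREEDY = re.compile(r'(?s){{(({{.*?}})|.)*}}')
LAZY = re.compile(r'(?s){{(({{.*?}})|.)*?}}')


def excluded_spans(pattern, text):
    """Return every substring the pattern would treat as template text.

    Hypothetical helper for illustration only, not a pywikibot function.
    """
    return [m.group(0) for m in pattern.finditer(text)]


# Made-up sample: an infobox at the start, a misspelled word, a trailing template.
text = "{{Infobox|name=X}} telepitettek {{stub}}"

greedy_spans = excluded_spans(GREEDY, text)
lazy_spans = excluded_spans(LAZY, text)

# One level of nesting is still matched as a single template by the lazy form.
nested_spans = excluded_spans(LAZY, "{{A|{{B}}}} word")

print(greedy_spans)
print(lazy_spans)
print(nested_spans)
```

With the greedy pattern the only span is the whole sample, so the misspelled word counts as template text and is skipped. With the lazy pattern the two templates are separate spans and the word between them stays replaceable.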
----------------------------------------------------------------------