jenkins-bot merged this change.

Approvals:
  Xqt: Looks good to me, approved
  jenkins-bot: Verified

pywikibot/textlib.py: Fix header regex to allow comments

This change fixes a bug in the header regex: it failed to match headers
that are followed by a comment on the same line, and headers that
contain a comment with a \n inside it.

Related bug report on the mailing list:
https://lists.wikimedia.org/pipermail/pywikibot/2018-August/009874.html

Change-Id: Id21200e596b6689c7bed35d0865cad5504be1676
---
M pywikibot/textlib.py
M tests/cosmetic_changes_tests.py
2 files changed, 16 insertions(+), 1 deletion(-)
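
Note (not part of the commit): a minimal standalone sketch for checking the
behaviour described in the commit message. The two patterns are copied from
the diff below; the names OLD_HEADER, NEW_HEADER, SAMPLES and find_headers
are made up for this illustration, and the sample strings are the second
section of each input used in tests/cosmetic_changes_tests.py.

```python
import re

# Header pattern before the change (copied from the removed line of the diff).
OLD_HEADER = re.compile(r'(?:(?<=\n)|\A)=+.+=+ *(?=\n|\Z)')

# Header pattern after the change (copied from the added lines of the diff).
NEW_HEADER = re.compile(
    r'(?:(?<=\n)|\A)'
    r'=(?:.|<!--[\s\S]*?-->)+='
    r' *(?:<!--[\s\S]*?--> *)*(?=\n|\Z)')

# Hypothetical sample texts: a heading with comments after it, a heading
# with a space and a comment after it, and a heading with a multi-line
# comment inside it.
SAMPLES = [
    '==2==<!--c--> <!--\n-->\nt',
    '==2== <!--c-->\nt',
    '==2<!--\n-->==\nt',
]


def find_headers(pattern, text):
    """Return every substring of text that the given pattern matches."""
    return [match.group() for match in pattern.finditer(text)]


if __name__ == '__main__':
    for sample in SAMPLES:
        print(repr(sample))
        print('  old:', find_headers(OLD_HEADER, sample))
        print('  new:', find_headers(NEW_HEADER, sample))
```

Run directly, the script should show the old pattern finding no header in
any of the three samples, while the new pattern matches each heading
together with its trailing or embedded comment. A plain heading such as
'==1==' is matched by both patterns.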

diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index e6ee2c1..700d82f 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -272,7 +272,10 @@
         # files
         'file': (FILE_LINK_REGEX, lambda site: '|'.join(site.namespaces[6])),
         # section headers
-        'header': re.compile(r'(?:(?<=\n)|\A)=+.+=+ *(?=\n|\Z)'),
+        'header': re.compile(
+            r'(?:(?<=\n)|\A)'
+            r'=(?:.|<!--[\s\S]*?-->)+='
+            r' *(?:<!--[\s\S]*?--> *)*(?=\n|\Z)'),
         # external links
         'hyperlink': compileLinkR(),
         # also finds links to foreign sites with preleading ":"
diff --git a/tests/cosmetic_changes_tests.py b/tests/cosmetic_changes_tests.py
index b1c6361..d5354c0 100644
--- a/tests/cosmetic_changes_tests.py
+++ b/tests/cosmetic_changes_tests.py
@@ -332,6 +332,18 @@
             self.cct.removeEmptySections('\n==Bar==\n[[cs:Foo]]'
                                          '\n[[Category:Baz]]'))

+    def test_remove_empty_sections_with_heading_comments(self):
+        """Test removeEmptySections with comments in the section headings."""
+        self.assertEqual(
+            '==2==<!--c--> <!--\n-->\nt',
+            self.cct.removeEmptySections('==1==\n==2==<!--c--> <!--\n-->\nt'))
+        self.assertEqual(
+            '==2== <!--c-->\nt',
+            self.cct.removeEmptySections('==1==\n==2== <!--c-->\nt'))
+        self.assertEqual(
+            '==2<!--\n-->==\nt',
+            self.cct.removeEmptySections('==1==\n==2<!--\n-->==\nt'))
+
     def test_translateAndCapitalizeNamespaces(self):
         """Test translateAndCapitalizeNamespaces method."""
         self.assertEqual(

To view, visit change 453302.

Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id21200e596b6689c7bed35d0865cad5504be1676
Gerrit-Change-Number: 453302
Gerrit-PatchSet: 6
Gerrit-Owner: Dalba <dalba.wiki@gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb@gmail.com>
Gerrit-Reviewer: Xqt <info@gno.de>
Gerrit-Reviewer: Zoranzoki21 <zorandori4444@gmail.com>
Gerrit-Reviewer: jenkins-bot (75)