jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/863023 )
Change subject: [bugfix]: unquote title for red-links in Index
......................................................................
[bugfix]: unquote title for red-links in Index
When titles are read from the Index page HTML, characters
can be URL-encoded (e.g. when a "'" is present in the page title).
Unquote the title to obtain the correct page title.
Change-Id: If1d6dfc0e411796df3d2b8ae1c673f76d99911ce
---
M pywikibot/proofreadpage.py
1 file changed, 16 insertions(+), 0 deletions(-)
Approvals:
Mpaa: Looks good to me, approved
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/proofreadpage.py b/pywikibot/proofreadpage.py
index 35ff2dd..56181aa 100644
--- a/pywikibot/proofreadpage.py
+++ b/pywikibot/proofreadpage.py
@@ -32,6 +32,7 @@
from functools import partial
from http import HTTPStatus
from typing import Any, Optional, Union
+from urllib.parse import unquote
from requests.exceptions import ReadTimeout
@@ -974,6 +975,7 @@
     title = self._parse_redlink(href)  # non-existing page
     if title is None:  # title not conforming to required format
         continue
+    title = unquote(title)
 else:
     title = a_tag.get('title')  # existing page
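
For illustration only, a minimal sketch of what the added unquote() call does to a percent-encoded title. The title strings below are hypothetical and not taken from the change; only urllib.parse.unquote itself is the call used in the patch.

```python
from urllib.parse import unquote

# Hypothetical red-link title as it might be extracted from an href:
# the apostrophe is percent-encoded as %27.
encoded_title = "Page:L%27Example.djvu/1"

# unquote() replaces %xx escapes with the characters they encode
# (decoded as UTF-8 by default).
decoded_title = unquote(encoded_title)
print(decoded_title)

# A title without percent-escapes passes through unchanged.
plain_title = unquote("Page:Example.djvu/1")
print(plain_title)
```

Because unquote() leaves strings without percent-escapes untouched, applying it unconditionally to red-link titles, as the patch does, should be harmless for titles that were never encoded.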
--
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: If1d6dfc0e411796df3d2b8ae1c673f76d99911ce
Gerrit-Change-Number: 863023
Gerrit-PatchSet: 2
Gerrit-Owner: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged