jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/863023 )
Change subject: [bugfix]: unquote title for red-links in Index"
......................................................................
[bugfix]: unquote title for red-links in Index"
When page titles are extracted from the HTML of an Index page, characters can be URL-encoded (e.g. when an "'" is present in the page title).
Unquote to obtain the correct page title.
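
For illustration only, a minimal sketch of what the unquoting step does, using the standard-library urllib.parse.unquote that this change imports. The title strings below are hypothetical and are not taken from any wiki; they only assume that an apostrophe arrives percent-encoded as %27.

```python
from urllib.parse import unquote

# Hypothetical title as it might be parsed from a red-link href,
# with the apostrophe percent-encoded as %27.
encoded_title = "Page:L%27Example.djvu/1"

# unquote() replaces %xx escapes with the characters they encode
# (decoded as UTF-8 by default).
decoded_title = unquote(encoded_title)
print(decoded_title)

# A title without percent-escapes is returned unchanged.
plain_title = "Page:Example.djvu/2"
print(unquote(plain_title) == plain_title)
```

Titles that contain no percent-escapes pass through unchanged, so applying unquote to every red-link title should be harmless for titles that were never encoded.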
Change-Id: If1d6dfc0e411796df3d2b8ae1c673f76d99911ce
---
M pywikibot/proofreadpage.py
1 file changed, 16 insertions(+), 0 deletions(-)
Approvals:
  Mpaa: Looks good to me, approved
  Xqt: Looks good to me, approved
  jenkins-bot: Verified
diff --git a/pywikibot/proofreadpage.py b/pywikibot/proofreadpage.py
index 35ff2dd..56181aa 100644
--- a/pywikibot/proofreadpage.py
+++ b/pywikibot/proofreadpage.py
@@ -32,6 +32,7 @@
 from functools import partial
 from http import HTTPStatus
 from typing import Any, Optional, Union
+from urllib.parse import unquote

 from requests.exceptions import ReadTimeout
@@ -974,6 +975,7 @@
                     title = self._parse_redlink(href)  # non-existing page
                     if title is None:  # title not conforming to required format
                         continue
+                    title = unquote(title)
                 else:
                     title = a_tag.get('title')  # existing page