jenkins-bot submitted this change.

Approvals:
  Mpaa: Looks good to me, approved
  Xqt: Looks good to me, approved
  jenkins-bot: Verified

[bugfix]: unquote title for red-links in Index

When titles are extracted from the HTML of an Index page, characters
can be URL-encoded (e.g. when an apostrophe "'" is present in the page title).

Unquote to obtain the correct page title.

Change-Id: If1d6dfc0e411796df3d2b8ae1c673f76d99911ce
---
M pywikibot/proofreadpage.py
1 file changed, 16 insertions(+), 0 deletions(-)

diff --git a/pywikibot/proofreadpage.py b/pywikibot/proofreadpage.py
index 35ff2dd..56181aa 100644
--- a/pywikibot/proofreadpage.py
+++ b/pywikibot/proofreadpage.py
@@ -32,6 +32,7 @@
 from functools import partial
 from http import HTTPStatus
 from typing import Any, Optional, Union
+from urllib.parse import unquote

 from requests.exceptions import ReadTimeout

@@ -974,6 +975,7 @@
                 title = self._parse_redlink(href)  # non-existing page
                 if title is None:  # title not conforming to required format
                     continue
+                title = unquote(title)
             else:
                 title = a_tag.get('title')  # existing page
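
For illustration only, here is a minimal sketch of what the added unquote() call does to a percent-encoded title. The page title is made up, and the assumption that a red-link title arrives with the apostrophe encoded as %27 is based solely on the commit message above. The real value comes from self._parse_redlink(href), which is not reproduced here.

```python
from urllib.parse import unquote

# Hypothetical title as it might be extracted from a red-link href;
# the apostrophe is percent-encoded as %27.
encoded_title = "Page:Alice%27s_Adventures.djvu/12"

# unquote() replaces %xx escapes with the characters they stand for
# (decoded as UTF-8 by default).
decoded_title = unquote(encoded_title)
print(decoded_title)  # Page:Alice's_Adventures.djvu/12

# A title without percent-escapes passes through unchanged.
plain_title = unquote("Page:Example.djvu/1")
```

Since titles without escapes are returned as-is, the call should be harmless for red-links that were never encoded in the first place.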


Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: If1d6dfc0e411796df3d2b8ae1c673f76d99911ce
Gerrit-Change-Number: 863023
Gerrit-PatchSet: 2
Gerrit-Owner: Mpaa <mpaa.wiki@gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki@gmail.com>
Gerrit-Reviewer: Xqt <info@gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged