jenkins-bot submitted this change.
[bugfix]: unquote title for red-links in Index
When titles are extracted from the HTML of an Index page, characters
can be URL-encoded (e.g. when an "'" is present in the page title).
Unquote the title to obtain the correct page title.
Change-Id: If1d6dfc0e411796df3d2b8ae1c673f76d99911ce
---
M pywikibot/proofreadpage.py
1 file changed, 16 insertions(+), 0 deletions(-)
diff --git a/pywikibot/proofreadpage.py b/pywikibot/proofreadpage.py
index 35ff2dd..56181aa 100644
--- a/pywikibot/proofreadpage.py
+++ b/pywikibot/proofreadpage.py
@@ -32,6 +32,7 @@
from functools import partial
from http import HTTPStatus
from typing import Any, Optional, Union
+from urllib.parse import unquote
from requests.exceptions import ReadTimeout
@@ -974,6 +975,7 @@
     title = self._parse_redlink(href) # non-existing page
     if title is None: # title not conforming to required format
         continue
+    title = unquote(title)
 else:
     title = a_tag.get('title') # existing page
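
A minimal sketch of the effect of the added unquote() call. The title below is a made-up example, not taken from a real Index page; it only assumes that an apostrophe in a red-link title arrives percent-encoded as %27.

```python
from urllib.parse import unquote

# Hypothetical red-link title as it might appear URL-encoded in the HTML.
encoded_title = "Page:L%27Example.djvu/1"

# unquote() replaces %XX escapes with the characters they stand for.
decoded_title = unquote(encoded_title)
print(decoded_title)

# A title without percent-escapes passes through unchanged.
plain_title = unquote("Page:Example.djvu/2")
print(plain_title)
```

Here %27 decodes to an apostrophe, so the first title becomes Page:L'Example.djvu/1, which is the form a page title lookup would expect.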
To view, visit change 863023.