jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/737201 )
Change subject: [doc] Update ROADMAP.rst
......................................................................
[doc] Update ROADMAP.rst
Change-Id: I0e2207b65781dedf2824806ccade17f147f3d87f
---
M ROADMAP.rst
1 file changed, 3 insertions(+), 0 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/ROADMAP.rst b/ROADMAP.rst
index 2c611c2..93acb6c 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -4,6 +4,8 @@
Improvements and Bugfixes
-------------------------
+* Add `title_delimiter_and_aliases` attribute to family files to support WikiHow family (T294761)
+* Only handle query limit if query module is limited (T294836)
* BaseBot has a public collections.Counter for reading, writing and skipping a page
* Upload: Retry upload if 'copyuploadbaddomain' API error occurs (T294825)
* Upload: Only set filekey/offset for files with names (T294916)
@@ -27,6 +29,7 @@
Code cleanups
-------------
+* Raise an Error exception if 'titles' is still used as where parameter in Site.search()
* Deprecated version.get_module_version() function was removed
* Deprecated setOptions/getOptions OptionHandler methods were removed
* Deprecated from_page() method of CosmeticChangesToolkit was removed
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/737201
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I0e2207b65781dedf2824806ccade17f147f3d87f
Gerrit-Change-Number: 737201
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/614565 )
Change subject: [bugfix] Add title_delimiter_and_alias into family files
......................................................................
[bugfix] Add title_delimiter_and_alias into family files
Titles are usually delimited by a space, and the alias is replaced
by this delimiter; e.g. "Main page" is the title with spaces as
delimiters but "Main_page" also works. Other families like wikihow
have a different setting.
- add title_delimiter_and_aliases to family.py as default
- add a different title_delimiter_and_aliases to wikihow_family.py
- use these settings for Page.Link and when comparing titles
- tests updated
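
The normalization described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `normalize_title` is a hypothetical helper, not a pywikibot function; it mirrors the pattern used in the diff, where the first character of the setting is the delimiter and the whole string is placed inside a regex character class.

```python
import re


def normalize_title(title: str, delimiter_and_aliases: str = ' _') -> str:
    """Collapse runs of delimiter/alias characters into one delimiter.

    Hypothetical helper for illustration only; not part of pywikibot.
    """
    # The first character is the delimiter, the others are aliases.
    sep = delimiter_and_aliases[0]
    # The setting is used inside a character class. A '-' placed
    # between two other characters would form a range there, which is
    # why a setting like '- ' lists the hyphen first.
    pattern = re.compile('[{}]+'.format(delimiter_and_aliases))
    return pattern.sub(sep, title)


print(normalize_title('Main_page'))        # Main page
print(normalize_title('Make tea', '- '))   # Make-tea
```

With the default `' _'` an underscore alias becomes a space; with a wikihow-style `'- '` a space alias becomes a hyphen.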
Bug: T294761
Change-Id: Ib7858b88324376b6bbbf788893308fcf66c4d154
---
M pywikibot/families/wikihow_family.py
M pywikibot/family.py
M pywikibot/page/__init__.py
M pywikibot/site/_basesite.py
M tests/link_tests.py
5 files changed, 61 insertions(+), 12 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/families/wikihow_family.py b/pywikibot/families/wikihow_family.py
index 4aa93f7..38d3ab4 100644
--- a/pywikibot/families/wikihow_family.py
+++ b/pywikibot/families/wikihow_family.py
@@ -15,7 +15,7 @@
"""Family class for Wikihow Wiki.
- .. versionaddded: 3.0
+ .. versionadded:: 3.0
"""
name = 'wikihow'
@@ -25,8 +25,12 @@
'ar', 'cs', 'de', 'en', 'es', 'fr', 'hi', 'id', 'it', 'ja', 'ko', 'nl',
'pt', 'ru', 'th', 'tr', 'vi', 'zh',
)
+
removed_wikis = ['ca', 'cy', 'fa', 'he', 'pl', 'ur']
+ title_delimiter_and_aliases = '- '
+ """.. versionadded:: 7.0"""
+
@classproperty
def domains(cls):
"""List of domains used by family wikihow."""
diff --git a/pywikibot/family.py b/pywikibot/family.py
index 006c548..d5240a9 100644
--- a/pywikibot/family.py
+++ b/pywikibot/family.py
@@ -542,6 +542,21 @@
# site. This value can specify this last one with (lang, family) tuple.
shared_urlshortner_wiki = None # type: Optional[Tuple[str, str]]
+ title_delimiter_and_aliases = ' _'
+ """Titles usually are delimited by a space and the alias is replaced
+ to this delimiter; e.g. "Main page" is the title with spaces as
+ delimiters but "Main_page" also works. Other families may have
+ different settings.
+
+ .. note:: The first character is used as delimiter, the others are
+ aliases.
+
+ .. warning:: This attribute is used within ``re.sub()`` method. Use
+ escape sequence if necessary
+
+ .. versionadded:: 7.0
+ """
+
_families = {}
@staticmethod
diff --git a/pywikibot/page/__init__.py b/pywikibot/page/__init__.py
index 59d5781..56841d6 100644
--- a/pywikibot/page/__init__.py
+++ b/pywikibot/page/__init__.py
@@ -5249,6 +5249,8 @@
else:
self._anchor = None
+ self._text = self._text.strip()
+
# Convert URL-encoded characters to unicode
self._text = pywikibot.tools.chars.url2string(
self._text, encodings=self._source.encodings())
@@ -5267,9 +5269,11 @@
'{!r} contains illegal char {!r}'.format(t, '\ufffd'))
# Cleanup whitespace
+ sep = self._source.family.title_delimiter_and_aliases[0]
t = re.sub(
- '[_ \xa0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]+',
- ' ', t)
+ '[{}\xa0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]+'
+ .format(self._source.family.title_delimiter_and_aliases),
+ sep, t)
# Strip spaces at both ends
t = t.strip()
# Remove left-to-right and right-to-left markers.
diff --git a/pywikibot/site/_basesite.py b/pywikibot/site/_basesite.py
index 3a3a605..f92ae18 100644
--- a/pywikibot/site/_basesite.py
+++ b/pywikibot/site/_basesite.py
@@ -378,10 +378,14 @@
return default_ns, title
return ns, name
- # Replace underscores with spaces and multiple combinations of them
- # with only one space
- title1 = re.sub(r'[_ ]+', ' ', title1)
- title2 = re.sub(r'[_ ]+', ' ', title2)
+ # Replace alias characters like underscores with title
+ # delimiters like spaces and multiple combinations of them with
+ # only one delimiter
+ sep = self.family.title_delimiter_and_aliases[0]
+ pattern = re.compile('[{}]+'
+ .format(self.family.title_delimiter_and_aliases))
+ title1 = pattern.sub(sep, title1)
+ title2 = pattern.sub(sep, title2)
if title1 == title2:
return True
diff --git a/tests/link_tests.py b/tests/link_tests.py
index 220bebd..f8e9dbb 100644
--- a/tests/link_tests.py
+++ b/tests/link_tests.py
@@ -64,6 +64,19 @@
default site is using completely different namespaces.
"""
+ def replaced(self, iterable):
+ """Replace family specific title delimiter."""
+ for items in iterable:
+ if isinstance(items, str):
+ items = [items]
+ items = [re.sub(' ',
+ self.site.family.title_delimiter_and_aliases[0],
+ item)
+ for item in items]
+ if len(items) == 1:
+ items = items[0]
+ yield items
+
def test_valid(self):
"""Test that valid titles are correctly normalized."""
title_tests = ['Sandbox', 'A "B"', "A 'B'", '.com', '~', '"', "'",
@@ -87,11 +100,11 @@
site = self.get_site()
- for title in title_tests:
+ for title in self.replaced(title_tests):
with self.subTest(title=title):
self.assertEqual(Link(title, site).title, title)
- for link, title in extended_title_tests:
+ for link, title in self.replaced(extended_title_tests):
with self.subTest(link=link, title=title):
self.assertEqual(Link(link, site).title, title)
@@ -138,7 +151,7 @@
title_tests = [
# Empty title
- (['', ':', '__ __', ' __ '],
+ (['', ':'],
r'^The link \[\[.*\]\] does not contain a page title$'),
(['A [ B', 'A ] B', 'A { B', 'A } B', 'A < B', 'A > B'],
@@ -165,12 +178,21 @@
([('x' * 256), ('Invalid:' + 'X' * 248)],
generate_overlength_exc_regex),
- (['Talk:', 'Category: ', 'Category: #bar'],
+ (['Talk:'],
generate_has_no_title_exc_regex),
]
+ # Known issues with wikihow.
+ if self.site.family.name != 'wikihow':
+ title_tests.extend([
+ (['Category: ', 'Category: #bar'],
+ generate_has_no_title_exc_regex),
+ (['__ __', ' __ '],
+ r'^The link \[\[\]\] does not contain a page title$'),
+ ])
+
for texts_to_test, exception_regex in title_tests:
- for text in texts_to_test:
+ for text in self.replaced(texts_to_test):
with self.subTest(title=text):
if callable(exception_regex):
regex = exception_regex(text)
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/614565
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ib7858b88324376b6bbbf788893308fcf66c4d154
Gerrit-Change-Number: 614565
Gerrit-PatchSet: 10
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-CC: Lgessler <lukegessler(a)gmail.com>
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/279755 )
Change subject: [IMPR] Rewrite len and bool function for Namespace class
......................................................................
[IMPR] Rewrite len and bool function for Namespace class
- len() is defined as the length of the iterable. The iterator uses
Namespace._distinct(), so len() should use it too instead of calculating
the length separately. This prevents the method from breaking if
Namespace._distinct(), Namespace.aliases or any other property is
changed later.
- bool() is derived from the len() method, but it should be independent of it
and always return True like a generic object class.
- Test added.
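
A minimal sketch of the idea, assuming a simplified stand-in class: `NamespaceSketch` and the body of its `_distinct()` method are hypothetical and only approximate what the real Namespace class does. The point is that `__len__` and `__iter__` share one source, while `__bool__` is defined separately.

```python
class NamespaceSketch:
    """Hypothetical stand-in for pywikibot's Namespace class."""

    def __init__(self, custom_name, canonical_name, aliases=()):
        self.custom_name = custom_name
        self.canonical_name = canonical_name
        self.aliases = list(aliases)

    def _distinct(self):
        # Assumed behaviour: list the names once, without repeating an
        # identical custom/canonical name.
        if self.custom_name == self.canonical_name:
            return [self.canonical_name] + self.aliases
        return [self.custom_name, self.canonical_name] + self.aliases

    def __iter__(self):
        return iter(self._distinct())

    def __len__(self):
        # Derived from the same source as __iter__, so both stay in sync.
        return len(self._distinct())

    def __bool__(self):
        # Without this method Python falls back to __len__ for truthiness.
        return True


file_ns = NamespaceSketch('Datei', 'File', ['Bild', 'Image'])
print(len(file_ns) == len(list(file_ns)))  # True
print(bool(file_ns))                       # True
```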
Change-Id: I9f37e3d511042317c2c895655c2bb138ef46c18c
---
M pywikibot/site/_namespace.py
M tests/namespace_tests.py
2 files changed, 15 insertions(+), 3 deletions(-)
Approvals:
Matěj Suchánek: Looks good to me, but someone else must approve
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/site/_namespace.py b/pywikibot/site/_namespace.py
index bafcb76..4f7adba 100644
--- a/pywikibot/site/_namespace.py
+++ b/pywikibot/site/_namespace.py
@@ -144,11 +144,20 @@
return self._contains_lowercase_name(name.lower())
+ def __bool__(self) -> bool:
+ """Obtain boolean method for Namespace class.
+
+ This method is implemented to be independent from __len__ method.
+
+ .. versionadded:: 7.0
+
+ :return: Always return True like generic object class.
+ """
+ return True
+
def __len__(self):
"""Obtain length of the iterable."""
- if self.custom_name == self.canonical_name:
- return len(self.aliases) + 1
- return len(self.aliases) + 2
+ return len(self._distinct())
def __iter__(self):
"""Return an iterator."""
diff --git a/tests/namespace_tests.py b/tests/namespace_tests.py
index 177e8ff..e1cb2d7 100644
--- a/tests/namespace_tests.py
+++ b/tests/namespace_tests.py
@@ -77,6 +77,9 @@
for val in ns.values()
for name in val))
+ # test boolean
+ self.assertTrue(all(x for x in ns.values()))
+
# Use a namespace object as a dict key
self.assertEqual(ns[ns[6]], ns[6])
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/279755
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I9f37e3d511042317c2c895655c2bb138ef46c18c
Gerrit-Change-Number: 279755
Gerrit-PatchSet: 5
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/735951 )
Change subject: [cleanup] Raise an exception if 'titles' is still used as where parameter
......................................................................
[cleanup] Raise an exception if 'titles' is still used as where parameter
The 'titles' value for the where parameter was deprecated 5 years ago
(since 20160224).
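
The effect on callers can be sketched as below. This is a simplified illustration, not the actual Site.search() code: `check_where` is a hypothetical helper and `Error` here is a local stand-in for pywikibot's exception class. Only the list of accepted values and the error message follow the diff.

```python
class Error(Exception):
    """Local stand-in for pywikibot's Error exception."""


# 'titles' is no longer an accepted value.
WHERE_TYPES = ('nearmatch', 'text', 'title')


def check_where(where):
    """Hypothetical helper: reject unrecognized 'where' values."""
    if where not in WHERE_TYPES:
        raise Error("search: unrecognized 'where' value: {}".format(where))
    return where


check_where('title')       # accepted
try:
    check_where('titles')  # formerly deprecated, now rejected
except Error as exc:
    print(exc)
```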
Change-Id: I0b9530be65db2a017cab5cbc4395cf47191a0dff
---
M pywikibot/site/_generators.py
1 file changed, 11 insertions(+), 15 deletions(-)
Approvals:
Matěj Suchánek: Looks good to me, but someone else must approve
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/site/_generators.py b/pywikibot/site/_generators.py
index ca50567..66d03c9 100644
--- a/pywikibot/site/_generators.py
+++ b/pywikibot/site/_generators.py
@@ -1296,27 +1296,23 @@
:raises TypeError: a namespace identifier has an inappropriate
type such as NoneType or bool
"""
- where_types = ['nearmatch', 'text', 'title', 'titles']
+ where_types = ['nearmatch', 'text', 'title']
if not searchstring:
raise Error('search: searchstring cannot be empty')
if where not in where_types:
raise Error("search: unrecognized 'where' value: {}".format(where))
- if where in ('title', 'titles'):
- if where == 'titles':
- issue_deprecation_warning("where='titles'", "where='title'",
- since='20160224')
- where = 'title'
- if self.has_extension('CirrusSearch') and \
- isinstance(self.family, pywikibot.family.WikimediaFamily):
- # 'title' search was disabled, use intitle instead
- searchstring = 'intitle:' + searchstring
- issue_deprecation_warning(
- "where='{}'".format(where),
- "searchstring='{}'".format(searchstring),
- since='20160224')
+ if where == 'title' \
+ and self.has_extension('CirrusSearch') \
+ and isinstance(self.family, pywikibot.family.WikimediaFamily):
+ # 'title' search was disabled, use intitle instead
+ searchstring = 'intitle:' + searchstring
+ issue_deprecation_warning(
+ "where='{}'".format(where),
+ "searchstring='{}'".format(searchstring),
+ since='20160224')
- where = None # default
+ where = None # default
if not namespaces and namespaces != 0:
namespaces = [ns_id for ns_id in self.namespaces if ns_id >= 0]
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/735951
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I0b9530be65db2a017cab5cbc4395cf47191a0dff
Gerrit-Change-Number: 735951
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/736944 )
Change subject: [doc] Update ROADMAP.rst
......................................................................
[doc] Update ROADMAP.rst
Change-Id: Iefc1dce436f8403d7a8e2d8c99d99d291c314dfd
---
M ROADMAP.rst
1 file changed, 10 insertions(+), 0 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/ROADMAP.rst b/ROADMAP.rst
index 817a239..2c611c2 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -4,6 +4,9 @@
Improvements and Bugfixes
-------------------------
+* BaseBot has a public collections.Counter for reading, writing and skipping a page
+* Upload: Retry upload if 'copyuploadbaddomain' API error occurs (T294825)
+* Upload: Only set filekey/offset for files with names (T294916)
* Update invisible characters from unicodedata 14.0.0
* Make site parameter of textlib.replace_links() mandatory (T294649)
* Raise a generic ServerError if the http status code is unofficial (T293208)
@@ -24,6 +27,11 @@
Code cleanups
-------------
+* Deprecated version.get_module_version() function was removed
+* Deprecated setOptions/getOptions OptionHandler methods were removed
+* Deprecated from_page() method of CosmeticChangesToolkit was removed
+* Deprecated diff attribute of CosmeticChangesToolkit was removed in favour of show_diff
+* Deprecated namespace and pageTitle parameter of CosmeticChangesToolkit were removed
* Remove deprecated BaseSite namespace shortcuts
* Remove deprecated Family.get_cr_templates method in favour of Site.category_redirects()
* Remove deprecated Page.put_async() method (T193494)
@@ -66,6 +74,8 @@
Deprecations
^^^^^^^^^^^^
+* 7.0.0: Private BaseBot counters _treat_counter, _save_counter, _skip_counter will be removed in favour of collections.Counter counter attribute
+* 7.0.0: A boolean watch parameter in Page.save() is deprecated and will be desupported
* 7.0.0: baserevid parameter of editSource(), editQualifier(), removeClaims(), removeSources(), remove_qualifiers() DataSite methods will be removed
* 7.0.0: Values of APISite.allpages() parameter filterredir other than True, False and None are deprecated
* 6.5.0: OutputOption.output() method will be removed in favour of OutputOption.out property
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/736944
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Iefc1dce436f8403d7a8e2d8c99d99d291c314dfd
Gerrit-Change-Number: 736944
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged