jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816832 )
Change subject: [IMPR] decrease memory usage and improve processing speed
......................................................................
[IMPR] decrease memory usage and improve processing speed
- Use a generator instead of a list of pages to process. This decreases
memory usage a lot and also speeds up start time because the pages
are no longer sorted first.
- preload the page contents
- catch KeyboardInterrupt and leave the main loop
- add look & feel of CurrentPageBot
- print the execution time at the end
Change-Id: Ib3572e1cf76ce898bc238b6d40688c286603cdd9
---
M scripts/archivebot.py
1 file changed, 19 insertions(+), 17 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index e92448c..19ab8cf 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -815,26 +815,22 @@
return
for template_name in templates:
- pagelist = []
tmpl = pywikibot.Page(site, template_name, ns=10)
- if not filename and not pagename:
- if namespace is not None:
- ns = [str(namespace)]
- else:
- ns = []
- pywikibot.output('Fetching template transclusions...')
- pagelist.extend(tmpl.getReferences(only_template_inclusion=True,
- follow_redirects=False,
- namespaces=ns))
if filename:
with open(filename) as f:
- for pg in f.readlines():
- pagelist.append(pywikibot.Page(site, pg, ns=10))
- if pagename:
- pagelist.append(pywikibot.Page(site, pagename, ns=3))
- pagelist.sort()
- for pg in pagelist:
- pywikibot.output('Processing {}'.format(pg))
+ gen = [pywikibot.Page(site, line, ns=10) for line in f]
+ elif pagename:
+ gen = [pywikibot.Page(site, pagename, ns=3)]
+ else:
+ ns = [str(namespace)] if namespace is not None else []
+ pywikibot.output('Fetching template transclusions...')
+ gen = tmpl.getReferences(only_template_inclusion=True,
+ follow_redirects=False,
+ namespaces=ns,
+ content=True)
+ for pg in gen:
+ pywikibot.info('\n\n>>> <<lightpurple>>{}<<default>> <<<'
+ .format(pg.title()))
# Catching exceptions, so that errors in one page do not bail out
# the entire process
try:
@@ -847,7 +843,13 @@
except Exception:
pywikibot.exception('Error occurred while processing page {}'
.format(pg))
+ except KeyboardInterrupt:
+ pywikibot.info('\nUser quit bot run...')
+ return
if __name__ == '__main__':
+ start = datetime.datetime.now()
main()
+ pywikibot.info('\nExecution time: {} seconds'
+ .format((datetime.datetime.now() - start).seconds))
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816832
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ib3572e1cf76ce898bc238b6d40688c286603cdd9
Gerrit-Change-Number: 816832
Gerrit-PatchSet: 5
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: PotsdamLamb
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
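
The points listed in the commit message of change 816832 above (lazy page source, per-page error handling, leaving the loop on KeyboardInterrupt, printing the run time) can be sketched without Pywikibot as follows. This is only an illustration under stated assumptions: page_titles, process_all and run are hypothetical names, and the titles stand in for the pages that tmpl.getReferences() yields in the real script.

```python
import time
from typing import Callable, Iterable, Iterator, List


def page_titles(count: int) -> Iterator[str]:
    """Yield hypothetical page titles lazily, one at a time."""
    for number in range(count):
        yield 'User talk:Example{}'.format(number)


def process_all(titles: Iterable[str],
                handle: Callable[[str], None]) -> List[str]:
    """Process titles as they arrive and return the ones that succeeded."""
    done = []
    for title in titles:
        try:
            handle(title)
        except KeyboardInterrupt:
            # Ctrl-C leaves the main loop instead of showing a traceback.
            print('User quit bot run...')
            break
        except Exception as error:
            # An error in one page must not stop the whole run.
            print('Error while processing {}: {}'.format(title, error))
        else:
            done.append(title)
    return done


def run() -> List[str]:
    """Process three sample titles and report the elapsed time."""
    start = time.monotonic()
    done = process_all(page_titles(3), print)
    print('Execution time: {:.3f} seconds'.format(time.monotonic() - start))
    return done


if __name__ == '__main__':
    run()
```

Because nothing is collected or sorted up front, the first page is handled as soon as the source yields it, and memory use does not grow with the number of pages.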
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816703 )
Change subject: [BUGFIX] Add localized "archive" variables to archivebot.py
......................................................................
[BUGFIX] Add localized "archive" variables to archivebot.py
- Non-Latin digit support, introduced with
https://gerrit.wikimedia.org/r/c/pywikibot/core/+/163213,
never worked because variable replacements like %(counter)d
expect an int instead of a str. This did not fail as long as
textlib.to_local_digits returned the value unchanged when no
local digits are defined for a language, but it could fail for
languages that have them. With 7.5, textlib.to_local_digits always
returns a str and the archivebot failed. This was fixed recently
with 7.5.1.
- Users should be able to decide whether to use Latin or non-Latin
digits. Therefore several new fields such as 'localcounter' were
introduced, which use the localized number instead of the Latin one.
This does not break existing templates that use the %d replacement,
except in rare cases where the user has already replaced it with %s.
- Restore the old values for non-local fields
- Remove the 7.5.1 changes
- Add a sanity check in the analyze_page() method for the case that
local fields are used with %d, and show a warning in this case.
- Update some related documentation
Bug: T71551
Bug: T313682
Bug: T313692
Change-Id: I05c165109aa49cfea40339f7fbdaff0150a62928
---
M pywikibot/page/_pages.py
M pywikibot/userinterfaces/transliteration.py
M scripts/archivebot.py
3 files changed, 71 insertions(+), 46 deletions(-)
Approvals:
Matěj Suchánek: Looks good to me, but someone else must approve
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/page/_pages.py b/pywikibot/page/_pages.py
index 0d4f2be..d130a89 100644
--- a/pywikibot/page/_pages.py
+++ b/pywikibot/page/_pages.py
@@ -2145,9 +2145,10 @@
@property
@cached
def raw_extracted_templates(self):
- """
- Extract templates using :py:obj:`textlib.extract_templates_and_params`.
+ """Extract templates and parameters.
+ This method is using
+ :func:`pywikibot.textlib.extract_templates_and_params`.
Disabled parts and whitespace are stripped, except for
whitespace in anonymous positional arguments.
@@ -2156,13 +2157,11 @@
return textlib.extract_templates_and_params(self.text, True, True)
def templatesWithParams(self):
- """
- Return templates used on this Page.
+ """Return templates used on this Page.
- The templates are extracted by
- :py:obj:`textlib.extract_templates_and_params`, with positional
- arguments placed first in order, and each named argument
- appearing as 'name=value'.
+ The templates are extracted by :meth:`raw_extracted_templates`,
+ with positional arguments placed first in order, and each named
+ argument appearing as 'name=value'.
All parameter keys and values for each template are stripped of
whitespace.
diff --git a/pywikibot/userinterfaces/transliteration.py b/pywikibot/userinterfaces/transliteration.py
index 6377b64..c0e65ca 100644
--- a/pywikibot/userinterfaces/transliteration.py
+++ b/pywikibot/userinterfaces/transliteration.py
@@ -4,6 +4,7 @@
#
# Distributed under the terms of the MIT license.
#
+#: Non latin digits used by the framework
NON_LATIN_DIGITS = {
'bn': '০১২৩৪৫৬৭৮৯',
'ckb': '٠١٢٣٤٥٦٧٨٩',
@@ -19,6 +20,7 @@
'te': '౦౧౨౩౪౫౬౭౮౯',
}
+
_trans = {
'À': 'A', 'Á': 'A', 'Â': 'A', 'Ầ': 'A', 'Ấ': 'A', 'Ẫ': 'A', 'Ẩ': 'A',
'Ậ': 'A', 'Ã': 'A', 'Ā': 'A', 'Ă': 'A', 'Ằ': 'A', 'Ắ': 'A', 'Ẵ': 'A',
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index ecc33d6..e92448c 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -52,18 +52,33 @@
key A secret key that (if valid) allows archives not to be
subpages of the page being archived.
-Variables below can be used in the value for "archive" in the template above:
+Variables below can be used in the value for "archive" in the template
+above; numbers are latin digits:
-%(counter)s the current value of the counter
-%(year)s year of the thread being archived
-%(isoyear)s ISO year of the thread being archived
-%(isoweek)s ISO week number of the thread being archived
-%(semester)s semester term of the year of the thread being archived
-%(quarter)s quarter of the year of the thread being archived
-%(month)s month (as a number 1-12) of the thread being archived
+%(counter)d the current value of the counter
+%(year)d year of the thread being archived
+%(isoyear)d ISO year of the thread being archived
+%(isoweek)d ISO week number of the thread being archived
+%(semester)d semester term of the year of the thread being archived
+%(quarter)d quarter of the year of the thread being archived
+%(month)d month (as a number 1-12) of the thread being archived
%(monthname)s localized name of the month above
%(monthnameshort)s first three letters of the name above
-%(week)s week number of the thread being archived
+%(week)d week number of the thread being archived
+
+Alternatively you may use localized digits. This is only available for a
+few site languages. Refer :attr:`NON_LATIN_DIGITS
+<pywikibot.userinterfaces.transliteration.NON_LATIN_DIGITS>` whether
+there is a localized one:
+
+%(localcounter)s the current value of the counter
+%(localyear)s year of the thread being archived
+%(localisoyear)s ISO year of the thread being archived
+%(localisoweek)s ISO week number of the thread being archived
+%(localsemester)s semester term of the year of the thread being archived
+%(localquarter)s quarter of the year of the thread being archived
+%(localmonth)s month (as a number 1-12) of the thread being archived
+%(localweek)s week number of the thread being archived
The ISO calendar starts with the Monday of the week which has at least four
days in the new Gregorian calendar. If January 1st is between Monday and
@@ -87,9 +102,8 @@
-page:PAGE archive a single PAGE, default ns is a user talk page
-salt:SALT specify salt
-.. versionchanged:: 7.5.1
- string presentation type should be used for "archive" variable in the
- template to support non latin values
+.. versionchanged:: 7.6
+ Localized variables for "archive" template parameter are supported
"""
#
# (C) Pywikibot team, 2006-2022
@@ -104,6 +118,7 @@
from collections import OrderedDict, defaultdict
from hashlib import md5
from math import ceil
+from textwrap import fill
from typing import Any, Optional, Pattern
from warnings import warn
@@ -484,16 +499,10 @@
return self.get_attr('key') == hexdigest
def load_config(self) -> None:
- """Load and validate archiver template.
-
- .. versionchanged:: 7.5.1
- replace archive pattern fields to string conversion
- """
+ """Load and validate archiver template."""
pywikibot.info('Looking for: {{{{{}}}}} in {}'
.format(self.tpl.title(), self.page))
- fields = self.get_params(self.now, 0).keys() # dummy parameters
- pattern = re.compile(r'%(\((?:{})\))d'.format('|'.join(fields)))
for tpl, params in self.page.raw_extracted_templates:
try: # Check tpl name before comparing; it might be invalid.
tpl_page = pywikibot.Page(self.site, tpl, ns=10)
@@ -503,11 +512,7 @@
if tpl_page == self.tpl:
for item, value in params.items():
- # convert archive pattern fields to string
- # to support non latin digits
- if item == 'archive':
- value = pattern.sub(r'%\1s', value)
- self.set_attr(item.strip(), value.strip())
+ self.set_attr(item, value)
break
else:
raise MissingConfigError('Missing or malformed template')
@@ -562,20 +567,22 @@
def get_params(self, timestamp, counter: int) -> dict:
"""Make params for archiving template."""
lang = self.site.lang
- return {
- 'counter': to_local_digits(counter, lang),
- 'year': to_local_digits(timestamp.year, lang),
- 'isoyear': to_local_digits(timestamp.isocalendar()[0], lang),
- 'isoweek': to_local_digits(timestamp.isocalendar()[1], lang),
- 'semester': to_local_digits(int(ceil(timestamp.month / 6)), lang),
- 'quarter': to_local_digits(int(ceil(timestamp.month / 3)), lang),
- 'month': to_local_digits(timestamp.month, lang),
- 'monthname': self.month_num2orig_names[timestamp.month]['long'],
- 'monthnameshort': self.month_num2orig_names[
- timestamp.month]['short'],
- 'week': to_local_digits(
- int(time.strftime('%W', timestamp.timetuple())), lang),
+ params = {
+ 'counter': counter,
+ 'year': timestamp.year,
+ 'isoyear': timestamp.isocalendar()[0],
+ 'isoweek': timestamp.isocalendar()[1],
+ 'semester': int(ceil(timestamp.month / 6)),
+ 'quarter': int(ceil(timestamp.month / 3)),
+ 'month': timestamp.month,
+ 'week': int(time.strftime('%W', timestamp.timetuple())),
}
+ params.update({'local' + key: to_local_digits(value, lang)
+ for key, value in params.items()})
+ monthnames = self.month_num2orig_names[timestamp.month]
+ params['monthname'] = monthnames['long']
+ params['monthnameshort'] = monthnames['short']
+ return params
def analyze_page(self) -> Set[ShouldArchive]:
"""Analyze DiscussionPage."""
@@ -588,6 +595,9 @@
whys = set()
pywikibot.output('Processing {} threads'
.format(len(self.page.threads)))
+ fields = self.get_params(self.now, 0).keys() # dummy parameters
+ regex = re.compile(r'%(\((?:{})\))d'.format('|'.join(fields)))
+ stringpattern = regex.sub(r'%\1s', pattern)
for i, thread in enumerate(self.page.threads):
# TODO: Make an option so that unstamped (unsigned) posts get
# archived.
@@ -598,7 +608,21 @@
params = self.get_params(thread.timestamp, counter)
# this is actually just a dummy key to group the threads by
# "era" regardless of the counter and deal with it later
- key = pattern % params
+ try:
+ key = pattern % params
+ except TypeError as e:
+ if 'a real number is required' in str(e):
+ pywikibot.error(e)
+ pywikibot.info(
+ fill('<<lightblue>>Use string format field like '
+ '%(localfield)s instead of %(localfield)d. '
+ 'Trying to solve it...'))
+ pywikibot.info()
+ pattern = stringpattern
+ key = pattern % params
+ else:
+ raise MalformedConfigError(e)
+
threads_per_archive[key].append((i, thread))
whys.add(why) # xxx: we don't know if we ever archive anything
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816703
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I05c165109aa49cfea40339f7fbdaff0150a62928
Gerrit-Change-Number: 816703
Gerrit-PatchSet: 5
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Ladsgroup <Ladsgroup(a)gmail.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: PotsdamLamb
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
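
The interplay of integer fields, localized string fields and the %d fallback described in change 816703 above can be illustrated with a small self-contained sketch. The names to_local, make_params and render are hypothetical; to_local is only a stand-in for textlib.to_local_digits, and the Bengali digits are copied from the NON_LATIN_DIGITS table shown in the diff.

```python
import re

# Bengali digits, copied from the NON_LATIN_DIGITS table in the diff above.
BENGALI_DIGITS = '০১২৩৪৫৬৭৮৯'


def to_local(value: int, digits: str = BENGALI_DIGITS) -> str:
    """Hypothetical stand-in for textlib.to_local_digits; returns a str."""
    return str(value).translate(str.maketrans('0123456789', digits))


def make_params(counter: int, year: int) -> dict:
    """Build int fields plus a 'local'-prefixed str twin for each of them."""
    params = {'counter': counter, 'year': year}
    params.update({'local' + key: to_local(value)
                   for key, value in params.items()})
    return params


def render(pattern: str, params: dict) -> str:
    """Fill the pattern; retry with %s if %d was used on a str field."""
    try:
        return pattern % params
    except TypeError:
        # %(localcounter)d fails because the value is a str, so rewrite
        # every known %(field)d to %(field)s and format again.
        regex = re.compile(r'%(\((?:{})\))d'.format('|'.join(params)))
        return regex.sub(r'%\1s', pattern) % params


if __name__ == '__main__':
    fields = make_params(12, 2022)
    print(render('Archive %(counter)d', fields))
    print(render('Archive %(localcounter)s', fields))
    print(render('Archive %(localcounter)d', fields))
```

Keeping the plain fields as int means existing templates with %d continue to work, while the local* fields are always str and therefore need %s.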
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816286 )
Change subject: [IMPR] Make GoogleSearchPageGenerator an abc.Generator
......................................................................
[IMPR] Make GoogleSearchPageGenerator an abc.Generator
- Derive GoogleSearchPageGenerator from tools.collections.GeneratorWrapper
- rename the __iter__ method to the generator property to be reused by
the Wrapper class
Bug: T313681
Change-Id: I8a861aa5021e773253ee5fc463bf935c766845ae
---
M pywikibot/pagegenerators/_generators.py
1 file changed, 14 insertions(+), 7 deletions(-)
Approvals:
Matěj Suchánek: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/pagegenerators/_generators.py b/pywikibot/pagegenerators/_generators.py
index f45ba45..c9f208e 100644
--- a/pywikibot/pagegenerators/_generators.py
+++ b/pywikibot/pagegenerators/_generators.py
@@ -31,6 +31,7 @@
from pywikibot.comms import http
from pywikibot.exceptions import APIError, ServerError
from pywikibot.tools import deprecated
+from pywikibot.tools.collections import GeneratorWrapper
from pywikibot.tools.itertools import filter_unique, itergroup
@@ -780,9 +781,8 @@
# following classes just ported from version 1 without revision; not tested
-class GoogleSearchPageGenerator(Iterable['pywikibot.page.Page']):
- """
- Page generator using Google search results.
+class GoogleSearchPageGenerator(GeneratorWrapper):
+ """Page generator using Google search results.
To use this generator, you need to install the package 'google':
@@ -793,6 +793,9 @@
As there are concerns about Google's Terms of Service, this
generator prints a warning for each query.
+
+ .. versionchanged:: 7.6
+ subclassed from :class:`pywikibot.tools.collections.GeneratorWrapper`
"""
def __init__(self, query: Optional[str] = None,
@@ -834,11 +837,15 @@
pywikibot.warning('Please read http://www.google.com/accounts/TOS')
yield from google.search(query)
- def __iter__(self):
- """Iterate results.
+ @property
+ def generator(self) -> Iterator['pywikibot.page.Page']:
+ """Yield results from :meth:`queryGoogle` query.
Google contains links in the format:
https://de.wikipedia.org/wiki/en:Foobar
+
+ .. versionchanged:: 7.6
+ changed from iterator method to generator property
"""
# restrict query to local site
local_query = '{} site:{}'.format(self.query, self.site.hostname())
@@ -894,7 +901,7 @@
class XMLDumpPageGenerator(abc.Iterator): # type: ignore[type-arg]
- """Xml generator that yields Page objects.
+ """Xml iterator that yields Page objects.
.. versionadded:: 7.2
the `content` parameter
@@ -955,7 +962,7 @@
@deprecated('XMLDumpPageGenerator with content=True parameter', since='7.2.0')
class XMLDumpOldPageGenerator(XMLDumpPageGenerator):
- """Xml generator that yields Page objects with old text loaded.
+ """Xml iterator that yields Page objects with old text loaded.
.. deprecated:: 7.2
:class:`XMLDumpPageGenerator` with `content` parameter should be
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816286
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I8a861aa5021e773253ee5fc463bf935c766845ae
Gerrit-Change-Number: 816286
Gerrit-PatchSet: 6
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
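
The pattern behind change 816286 above, where a subclass only supplies a generator property and the base class provides the iteration protocol, might look roughly like this. WrapperSketch and TitleSource are hypothetical names; this is not the code of pywikibot.tools.collections.GeneratorWrapper, only a minimal sketch of the idea.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List


class WrapperSketch(ABC):
    """Hypothetical base class that iterates over self.generator."""

    @property
    @abstractmethod
    def generator(self) -> Iterator[str]:
        """Return the iterator that produces the items."""

    def __iter__(self) -> 'WrapperSketch':
        return self

    def __next__(self) -> str:
        # Create the underlying iterator on first use, then reuse it.
        if not hasattr(self, '_started'):
            self._started = self.generator
        return next(self._started)


class TitleSource(WrapperSketch):
    """Subclass that only defines where the items come from."""

    def __init__(self, titles: List[str]) -> None:
        self.titles = titles

    @property
    def generator(self) -> Iterator[str]:
        yield from self.titles


if __name__ == '__main__':
    source = TitleSource(['Foo', 'Bar', 'Baz'])
    print(next(source))
    print(list(source))
```

With this layout the subclass no longer needs its own __iter__ method, which matches the rename from __iter__ to the generator property in the diff.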
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816346 )
Change subject: [BUGFIX] replace archive pattern fields to string conversion
......................................................................
[BUGFIX] replace archive pattern fields to string conversion
use string conversion format fields to support non latin numbers
introduced with https://gerrit.wikimedia.org/r/c/pywikibot/core/+/163213
backported from master branch
Bug: T313692
Change-Id: Ic543ad607d35a68cdde04a18653804996f96fdb2
---
M docs/requirements-py3.txt
M pywikibot/__metadata__.py
M scripts/archivebot.py
3 files changed, 35 insertions(+), 18 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/docs/requirements-py3.txt b/docs/requirements-py3.txt
index c213a4a..7c9fd27 100644
--- a/docs/requirements-py3.txt
+++ b/docs/requirements-py3.txt
@@ -1,4 +1,4 @@
# This is a PIP requirements file for building Sphinx documentation of pywikibot
# requirements.txt is also needed
-sphinx >= 4.5.0,!=5.0.0,!=5.0.1,!=5.0.2
\ No newline at end of file
+sphinx == 4.5.0
\ No newline at end of file
diff --git a/pywikibot/__metadata__.py b/pywikibot/__metadata__.py
index 0c0e1f8..21e0e01 100644
--- a/pywikibot/__metadata__.py
+++ b/pywikibot/__metadata__.py
@@ -11,7 +11,7 @@
__name__ = 'pywikibot'
-__version__ = '7.5.0'
+__version__ = '7.5.1'
__description__ = 'Python MediaWiki Bot Framework'
__maintainer__ = 'The Pywikibot team'
__maintainer_email__ = 'pywikibot(a)lists.wikimedia.org'
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index 4ab18e6..ecc33d6 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -54,16 +54,16 @@
Variables below can be used in the value for "archive" in the template above:
-%(counter)d the current value of the counter
-%(year)d year of the thread being archived
-%(isoyear)d ISO year of the thread being archived
-%(isoweek)d ISO week number of the thread being archived
-%(semester)d semester term of the year of the thread being archived
-%(quarter)d quarter of the year of the thread being archived
-%(month)d month (as a number 1-12) of the thread being archived
+%(counter)s the current value of the counter
+%(year)s year of the thread being archived
+%(isoyear)s ISO year of the thread being archived
+%(isoweek)s ISO week number of the thread being archived
+%(semester)s semester term of the year of the thread being archived
+%(quarter)s quarter of the year of the thread being archived
+%(month)s month (as a number 1-12) of the thread being archived
%(monthname)s localized name of the month above
%(monthnameshort)s first three letters of the name above
-%(week)d week number of the thread being archived
+%(week)s week number of the thread being archived
The ISO calendar starts with the Monday of the week which has at least four
days in the new Gregorian calendar. If January 1st is between Monday and
@@ -86,6 +86,10 @@
-namespace:NS only archive pages from a given namespace
-page:PAGE archive a single PAGE, default ns is a user talk page
-salt:SALT specify salt
+
+.. versionchanged:: 7.5.1
+ string presentation type should be used for "archive" variable in the
+ template to support non latin values
"""
#
# (C) Pywikibot team, 2006-2022
@@ -434,7 +438,6 @@
self.maxsize = 2096128 # 2 MB - 1 KB gap
self.page = DiscussionPage(page, self)
- self.load_config()
self.comment_params = {
'from': self.page.title(),
}
@@ -444,6 +447,7 @@
self.month_num2orig_names = {}
for n, (long, short) in enumerate(self.site.months_names, start=1):
self.month_num2orig_names[n] = {'long': long, 'short': short}
+ self.load_config()
def get_attr(self, attr, default='') -> Any:
"""Get an archiver attribute."""
@@ -480,25 +484,38 @@
return self.get_attr('key') == hexdigest
def load_config(self) -> None:
- """Load and validate archiver template."""
- pywikibot.output('Looking for: {{{{{}}}}} in {}'.format(
- self.tpl.title(), self.page))
+ """Load and validate archiver template.
+
+ .. versionchanged:: 7.5.1
+ replace archive pattern fields to string conversion
+ """
+ pywikibot.info('Looking for: {{{{{}}}}} in {}'
+ .format(self.tpl.title(), self.page))
+
+ fields = self.get_params(self.now, 0).keys() # dummy parameters
+ pattern = re.compile(r'%(\((?:{})\))d'.format('|'.join(fields)))
for tpl, params in self.page.raw_extracted_templates:
try: # Check tpl name before comparing; it might be invalid.
tpl_page = pywikibot.Page(self.site, tpl, ns=10)
tpl_page.title()
except Error:
continue
+
if tpl_page == self.tpl:
for item, value in params.items():
+ # convert archive pattern fields to string
+ # to support non latin digits
+ if item == 'archive':
+ value = pattern.sub(r'%\1s', value)
self.set_attr(item.strip(), value.strip())
break
else:
raise MissingConfigError('Missing or malformed template')
- if not self.get_attr('algo', ''):
- raise MissingConfigError('Missing argument "algo" in template')
- if not self.get_attr('archive', ''):
- raise MissingConfigError('Missing argument "archive" in template')
+
+ for field in ('algo', 'archive'):
+ if not self.get_attr(field, ''):
+ raise MissingConfigError('Missing argument {!r} in template'
+ .format(field))
def should_archive_thread(self, thread: DiscussionThread
) -> Optional[ShouldArchive]:
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816346
Gerrit-Project: pywikibot/core
Gerrit-Branch: stable
Gerrit-Change-Id: Ic543ad607d35a68cdde04a18653804996f96fdb2
Gerrit-Change-Number: 816346
Gerrit-PatchSet: 2
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
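
How the numeric "archive" variables documented in the docstring above relate to a thread timestamp can be shown with plain datetime arithmetic. The function name date_fields is hypothetical; the expressions mirror those in get_params as shown in change 816703, without the counter and the month names.

```python
import datetime
import time
from math import ceil


def date_fields(timestamp: datetime.datetime) -> dict:
    """Compute the numeric archive fields for one timestamp."""
    iso = timestamp.isocalendar()
    return {
        'year': timestamp.year,
        'isoyear': iso[0],   # may differ from year around New Year
        'isoweek': iso[1],
        'semester': int(ceil(timestamp.month / 6)),
        'quarter': int(ceil(timestamp.month / 3)),
        'month': timestamp.month,
        # %W counts weeks starting on Monday; days before the first
        # Monday of the year belong to week 0.
        'week': int(time.strftime('%W', timestamp.timetuple())),
    }


if __name__ == '__main__':
    print(date_fields(datetime.datetime(2022, 7, 25)))
    # 1 January 2021 is a Friday and still belongs to ISO year 2020.
    print(date_fields(datetime.datetime(2021, 1, 1)))
```

The second call shows why %(isoyear) and %(year) should not be mixed with the wrong week field: around the turn of the year the ISO year and the calendar year can differ.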
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816323 )
Change subject: [BUGFIX] replace archive pattern fields to string conversion
......................................................................
[BUGFIX] replace archive pattern fields to string conversion
use string conversion format fields to support non latin numbers
introduced with https://gerrit.wikimedia.org/r/c/pywikibot/core/+/163213
Bug: T313692
Change-Id: I205b8b86964887d6af184ecf058d4e50d5ed6fb3
---
M scripts/archivebot.py
1 file changed, 33 insertions(+), 16 deletions(-)
Approvals:
PotsdamLamb: Looks good to me, but someone else must approve
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index 4ab18e6..ecc33d6 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -54,16 +54,16 @@
Variables below can be used in the value for "archive" in the template above:
-%(counter)d the current value of the counter
-%(year)d year of the thread being archived
-%(isoyear)d ISO year of the thread being archived
-%(isoweek)d ISO week number of the thread being archived
-%(semester)d semester term of the year of the thread being archived
-%(quarter)d quarter of the year of the thread being archived
-%(month)d month (as a number 1-12) of the thread being archived
+%(counter)s the current value of the counter
+%(year)s year of the thread being archived
+%(isoyear)s ISO year of the thread being archived
+%(isoweek)s ISO week number of the thread being archived
+%(semester)s semester term of the year of the thread being archived
+%(quarter)s quarter of the year of the thread being archived
+%(month)s month (as a number 1-12) of the thread being archived
%(monthname)s localized name of the month above
%(monthnameshort)s first three letters of the name above
-%(week)d week number of the thread being archived
+%(week)s week number of the thread being archived
The ISO calendar starts with the Monday of the week which has at least four
days in the new Gregorian calendar. If January 1st is between Monday and
@@ -86,6 +86,10 @@
-namespace:NS only archive pages from a given namespace
-page:PAGE archive a single PAGE, default ns is a user talk page
-salt:SALT specify salt
+
+.. versionchanged:: 7.5.1
+ string presentation type should be used for "archive" variable in the
+ template to support non latin values
"""
#
# (C) Pywikibot team, 2006-2022
@@ -434,7 +438,6 @@
self.maxsize = 2096128 # 2 MB - 1 KB gap
self.page = DiscussionPage(page, self)
- self.load_config()
self.comment_params = {
'from': self.page.title(),
}
@@ -444,6 +447,7 @@
self.month_num2orig_names = {}
for n, (long, short) in enumerate(self.site.months_names, start=1):
self.month_num2orig_names[n] = {'long': long, 'short': short}
+ self.load_config()
def get_attr(self, attr, default='') -> Any:
"""Get an archiver attribute."""
@@ -480,25 +484,38 @@
return self.get_attr('key') == hexdigest
def load_config(self) -> None:
- """Load and validate archiver template."""
- pywikibot.output('Looking for: {{{{{}}}}} in {}'.format(
- self.tpl.title(), self.page))
+ """Load and validate archiver template.
+
+ .. versionchanged:: 7.5.1
+ replace archive pattern fields to string conversion
+ """
+ pywikibot.info('Looking for: {{{{{}}}}} in {}'
+ .format(self.tpl.title(), self.page))
+
+ fields = self.get_params(self.now, 0).keys() # dummy parameters
+ pattern = re.compile(r'%(\((?:{})\))d'.format('|'.join(fields)))
for tpl, params in self.page.raw_extracted_templates:
try: # Check tpl name before comparing; it might be invalid.
tpl_page = pywikibot.Page(self.site, tpl, ns=10)
tpl_page.title()
except Error:
continue
+
if tpl_page == self.tpl:
for item, value in params.items():
+ # convert archive pattern fields to string
+ # to support non latin digits
+ if item == 'archive':
+ value = pattern.sub(r'%\1s', value)
self.set_attr(item.strip(), value.strip())
break
else:
raise MissingConfigError('Missing or malformed template')
- if not self.get_attr('algo', ''):
- raise MissingConfigError('Missing argument "algo" in template')
- if not self.get_attr('archive', ''):
- raise MissingConfigError('Missing argument "archive" in template')
+
+ for field in ('algo', 'archive'):
+ if not self.get_attr(field, ''):
+ raise MissingConfigError('Missing argument {!r} in template'
+ .format(field))
def should_archive_thread(self, thread: DiscussionThread
) -> Optional[ShouldArchive]:
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/816323
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I205b8b86964887d6af184ecf058d4e50d5ed6fb3
Gerrit-Change-Number: 816323
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: PotsdamLamb
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged