jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/822038 )
Change subject: [tests] Add archivebot tests to auto_run_script_list
......................................................................
[tests] Add archivebot tests to auto_run_script_list
Change-Id: I999a2a2d1a038d4d20a86f70cc41b84cc4d195b0
---
M tests/script_tests.py
1 file changed, 2 insertions(+), 0 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/tests/script_tests.py b/tests/script_tests.py
index 0080f53..cbed98f 100755
--- a/tests/script_tests.py
+++ b/tests/script_tests.py
@@ -76,6 +76,7 @@
}
auto_run_script_list = [
+ 'archivebot',
'blockpageschecker',
'category_redirect',
'checkimages',
@@ -99,6 +100,7 @@
# Some of these are not pretty, but at least they are informative
# and not backtraces starting deep in the pywikibot package.
no_args_expected_results = {
+ 'archivebot': 'No template was specified, using default',
# TODO: until done here, remember to set editor = None in user-config.py
'change_pagelang': 'No -setlang parameter given',
'checkimages': 'Execution time: 0 seconds',
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/822038
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I999a2a2d1a038d4d20a86f70cc41b84cc4d195b0
Gerrit-Change-Number: 822038
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/821799 )
Change subject: [bugfix] timestripper should skip HTML elements
......................................................................
[bugfix] timestripper should skip HTML elements
Remove HTML elements before searching for timestamp in text.
Also fix isort check.
Bug: T302496
Change-Id: Iad70c4dd803fd40aac6f8d100c80512a876ea724
---
M pywikibot/textlib.py
M tests/timestripper_tests.py
2 files changed, 12 insertions(+), 1 deletion(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index cc94eaf..d6eab05 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -25,7 +25,7 @@
from pywikibot.exceptions import InvalidTitleError, SiteDefinitionError
from pywikibot.family import Family
from pywikibot.time import TZoneFixedOffset
-from pywikibot.tools import deprecated, ModuleDeprecationWrapper
+from pywikibot.tools import ModuleDeprecationWrapper, deprecated
from pywikibot.userinterfaces.transliteration import NON_LATIN_DIGITS
@@ -2016,6 +2016,7 @@
# Remove parts that are not supposed to contain the timestamp, in order
# to reduce false positives.
line = removeDisabledParts(line)
+ line = removeHTMLParts(line)
line = to_latin_digits(line)
for pat in self.patterns:
diff --git a/tests/timestripper_tests.py b/tests/timestripper_tests.py
index b178bac..4929296 100755
--- a/tests/timestripper_tests.py
+++ b/tests/timestripper_tests.py
@@ -377,6 +377,16 @@
txt_match = self.date[:9] + '[[foo]]' + self.date[9:]
self.assertEqual(ts(txt_match), self.expected_date)
+ def test_timestripper_skip_html(self):
+ """Test dates in html are correctly skipped."""
+ ts = self.ts.timestripper
+
+ txt_match = '<div ' + self.fake_date + '>'
+ self.assertIsNone(ts(txt_match))
+
+ txt_match = self.date + '<div ' + self.fake_date + '>'
+ self.assertEqual(ts(txt_match), self.expected_date)
+
class TestTimeStripperDoNotArchiveUntil(TestTimeStripperCase):
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/821799
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Iad70c4dd803fd40aac6f8d100c80512a876ea724
Gerrit-Change-Number: 821799
Gerrit-PatchSet: 2
Gerrit-Owner: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
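The effect of this change can be illustrated with a minimal standalone sketch. It does not use pywikibot: `strip_html_tags` and `find_date` are hypothetical stand-ins for `removeHTMLParts` and the TimeStripper pattern search, and both regular expressions are simplifying assumptions.

```python
import re

# Hypothetical stand-in for textlib.removeHTMLParts: drops anything that
# looks like an HTML tag, including its attributes.
TAG_RE = re.compile(r'<[^<>]*>')
# Simplified date pattern (assumption): ISO-like 'YYYY-MM-DD'.
DATE_RE = re.compile(r'\b\d{4}-\d{2}-\d{2}\b')


def strip_html_tags(line: str) -> str:
    """Remove tag-like spans so text inside them is not searched."""
    return TAG_RE.sub('', line)


def find_date(line: str, skip_html: bool = True):
    """Return the first date-like string in line, or None."""
    if skip_html:
        line = strip_html_tags(line)
    match = DATE_RE.search(line)
    return match.group() if match else None


fake = '<div 2001-01-01>'
real = '2022-08-10 '
print(find_date(fake))                   # date inside the tag is skipped
print(find_date(real + fake))            # only the date outside the tag
print(find_date(fake, skip_html=False))  # without stripping: false positive
```

The two cases mirror the new test: a date that appears only inside a tag is ignored, and a real date followed by such a tag is still found.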
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/821796 )
Change subject: [IMPR] use backported pairwise in archivebot.py
......................................................................
[IMPR] use backported pairwise in archivebot.py
Function itertools.pairwise was introduced in Python 3.10; use the
backport from pywikibot.backports to support older versions.
Change-Id: I7bdd71920855df782aab1a66ca2bde2f1976c484
---
M scripts/archivebot.py
1 file changed, 3 insertions(+), 4 deletions(-)
Approvals:
Xqt: Verified; Looks good to me, approved
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index e05e565..a756bbd 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -115,7 +115,6 @@
# Distributed under the terms of the MIT license.
#
import datetime
-import itertools
import locale
import os
import re
@@ -129,7 +128,7 @@
import pywikibot
from pywikibot import i18n
-from pywikibot.backports import List, Set, Tuple
+from pywikibot.backports import List, Set, Tuple, pairwise
from pywikibot.exceptions import Error, NoPageError
from pywikibot.textlib import (
TimeStripper,
@@ -138,7 +137,7 @@
findmarker,
to_local_digits,
)
-from pywikibot.time import parse_duration, str2timedelta, MW_KEYS
+from pywikibot.time import MW_KEYS, parse_duration, str2timedelta
ShouldArchive = Tuple[str, str]
@@ -392,7 +391,7 @@
if self.keep:
# set the timestamp to the previous if the current is lower
- for first, second in itertools.pairwise(self.threads):
+ for first, second in pairwise(self.threads):
second.timestamp = self.max(first.timestamp, second.timestamp)
# This extra info is not desirable when run under the unittest
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/821796
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I7bdd71920855df782aab1a66ca2bde2f1976c484
Gerrit-Change-Number: 821796
Gerrit-PatchSet: 1
Gerrit-Owner: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
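A minimal sketch of what the loop above does, assuming threads are objects with a `timestamp` attribute. The `Thread` dataclass and the local `pairwise` fallback are illustrative assumptions, not pywikibot's classes, and the built-in `max` stands in for `self.max`.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import tee


def pairwise(iterable):
    """Fallback equivalent of itertools.pairwise (Python 3.10+)."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


@dataclass
class Thread:
    """Hypothetical stand-in for a discussion thread."""
    title: str
    timestamp: datetime


threads = [
    Thread('A', datetime(2022, 8, 3)),
    Thread('B', datetime(2022, 8, 1)),  # older than its predecessor
    Thread('C', datetime(2022, 8, 5)),
]

# Each thread gets at least the timestamp of the thread before it, so
# timestamps never decrease along the page (a running maximum).
for first, second in pairwise(threads):
    second.timestamp = max(first.timestamp, second.timestamp)

print([t.timestamp.day for t in threads])
```

Because the pairs overlap, an updated `second` becomes `first` in the next iteration, which is what makes the maximum carry forward.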
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/820889 )
Change subject: [IMPR] use User:MiszaBot/config as default template
......................................................................
[IMPR] use User:MiszaBot/config as default template
Change-Id: I697a73736f21f7947b99403877d2f16fbe7f665b
---
M scripts/archivebot.py
1 file changed, 14 insertions(+), 13 deletions(-)
Approvals:
Matěj Suchánek: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/archivebot.py b/scripts/archivebot.py
index 8b3100d..c2bf860 100755
--- a/scripts/archivebot.py
+++ b/scripts/archivebot.py
@@ -1,18 +1,19 @@
#!/usr/bin/python3
-"""
-archivebot.py - discussion page archiving bot.
+"""archivebot.py - discussion page archiving bot.
usage:
- python pwb.py archivebot [OPTIONS] TEMPLATE_PAGE
+ python pwb.py archivebot [OPTIONS] [TEMPLATE_PAGE]
-Bot examines backlinks (Special:WhatLinksHere) to TEMPLATE_PAGE.
-Then goes through all pages (unless a specific page specified using options)
-and archives old discussions. This is done by breaking a page into threads,
-then scanning each thread for timestamps. Threads older than a specified
-threshold are then moved to another page (the archive), which can be named
-either basing on the thread's name or then name can contain a counter which
-will be incremented when the archive reaches a certain size.
+Several TEMPLATE_PAGE templates can be given at once. Default is
+`User:MiszaBot/config`. Bot examines backlinks (Special:WhatLinksHere)
+to all TEMPLATE_PAGE templates. Then goes through all pages (unless a
+specific page is specified using options) and archives old discussions.
+This is done by breaking a page into threads, then scanning each thread
+for timestamps. Threads older than a specified threshold are then moved
+to another page (the archive), which can be named either based on the
+thread's name or with a name containing a counter which will be
+incremented when the archive reaches a certain size.
Transcluded template may contain the following parameters:
@@ -891,9 +892,9 @@
return
if not templates:
- pywikibot.bot.suggest_help(
- additional_text='No template was specified.')
- return
+ templates = ['User:MiszaBot/config']
+ pywikibot.info('No template was specified, using default {{{{{}}}}}.'
+ .format(templates[0]))
for template_name in templates:
tmpl = pywikibot.Page(site, template_name, ns=10)
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/820889
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I697a73736f21f7947b99403877d2f16fbe7f665b
Gerrit-Change-Number: 820889
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Whym <whym(a)whym.org>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
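The fallback and its message can be sketched in isolation as follows. `resolve_templates` is a hypothetical helper written for illustration; the real script logs through `pywikibot.info` instead of returning the message.

```python
DEFAULT_TEMPLATE = 'User:MiszaBot/config'


def resolve_templates(templates):
    """Return the given templates, or the default if none were given.

    Hypothetical helper; the second return value is the info message.
    """
    if templates:
        return list(templates), None
    templates = [DEFAULT_TEMPLATE]
    # '{{{{' renders '{{', '{}' is the placeholder, '}}}}' renders '}}'.
    message = 'No template was specified, using default {{{{{}}}}}.'.format(
        templates[0])
    return templates, message


print(resolve_templates([]))
print(resolve_templates(['User:Example/config']))
```

The doubled braces are needed because `str.format` treats a single brace as the start of a replacement field, so the wikitext-style `{{...}}` around the template name has to be escaped.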
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/820746 )
Change subject: [IMPR] backport pairwise() from Python 3.10
......................................................................
[IMPR] backport pairwise() from Python 3.10
Also use pairwise to iterate pairs of elements
Change-Id: I3fde479978960a7b033719ac5c5a26185c5cdd43
---
M pywikibot/backports.py
M scripts/claimit.py
M scripts/replace.py
M scripts/template.py
4 files changed, 34 insertions(+), 18 deletions(-)
Approvals:
Mpaa: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/backports.py b/pywikibot/backports.py
index d5a0ba2..9773ac6 100644
--- a/pywikibot/backports.py
+++ b/pywikibot/backports.py
@@ -123,3 +123,19 @@
         if string.endswith(suffix):
             return string[:-len(suffix)]
         return string
+
+
+# bpo-38200
+if PYTHON_VERSION >= (3, 10):
+    from itertools import pairwise
+else:
+    from itertools import tee
+
+    def pairwise(iterable):
+        """Return successive overlapping pairs taken from the input iterable.
+
+        .. versionadded:: 7.6
+        """
+        a, b = tee(iterable)
+        next(b, None)
+        return zip(a, b)
diff --git a/scripts/claimit.py b/scripts/claimit.py
index 3e1ac09..4fff795 100755
--- a/scripts/claimit.py
+++ b/scripts/claimit.py
@@ -52,6 +52,7 @@
#
import pywikibot
from pywikibot import WikidataBot, pagegenerators
+from pywikibot.backports import pairwise
# This is required for the text that is shown when you run this script
@@ -126,15 +127,15 @@
claims = []
repo = pywikibot.Site().data_repository()
- for i in range(0, len(commandline_claims), 2):
- claim = pywikibot.Claim(repo, commandline_claims[i])
+ for source_str, target_str in pairwise(commandline_claims):
+ claim = pywikibot.Claim(repo, source_str)
if claim.type == 'wikibase-item':
- target = pywikibot.ItemPage(repo, commandline_claims[i + 1])
+ target = pywikibot.ItemPage(repo, target_str)
elif claim.type == 'string':
- target = commandline_claims[i + 1]
+ target = target_str
elif claim.type == 'globe-coordinate':
coord_args = [
- float(c) for c in commandline_claims[i + 1].split(',')]
+ float(c) for c in target_str.split(',')]
if len(coord_args) >= 3:
precision = coord_args[2]
else:
diff --git a/scripts/replace.py b/scripts/replace.py
index ac97270..703dd5c 100755
--- a/scripts/replace.py
+++ b/scripts/replace.py
@@ -154,7 +154,7 @@
import pywikibot
from pywikibot import editor, fixes, i18n, pagegenerators, textlib
-from pywikibot.backports import Dict, Generator, List, Pattern, Tuple
+from pywikibot.backports import pairwise, Dict, Generator, List, Pattern, Tuple
from pywikibot.bot import ExistingPageBot, SingleSiteBot
from pywikibot.exceptions import InvalidPageError, NoPageError
from pywikibot.tools import chars
@@ -979,9 +979,9 @@
# The summary stored here won't be actually used but is only an example
site = pywikibot.Site()
single_summary = None
- for i in range(0, len(commandline_replacements), 2):
- replacement = Replacement(commandline_replacements[i],
- commandline_replacements[i + 1])
+
+ for old, new in pairwise(commandline_replacements):
+ replacement = Replacement(old, new)
if not single_summary:
single_summary = i18n.twtranslate(
site, 'replace-replacing',
diff --git a/scripts/template.py b/scripts/template.py
index 7e83f71..58421bd 100755
--- a/scripts/template.py
+++ b/scripts/template.py
@@ -113,6 +113,7 @@
import pywikibot
from pywikibot import i18n, pagegenerators, textlib
+from pywikibot.backports import pairwise
from pywikibot.bot import SingleSiteBot
from pywikibot.pagegenerators import XMLDumpPageGenerator
from pywikibot.tools.itertools import filter_unique, roundrobin_generators
@@ -215,7 +216,6 @@
:param args: command line arguments
"""
template_names = []
- templates = {}
options = {}
# If xmlfilename is None, references will be loaded from the live wiki.
xmlfilename = None
@@ -266,17 +266,16 @@
return
if bool(options.get('subst', False)) ^ options.get('remove', False):
- for template_name in template_names:
- templates[template_name] = None
+ templates = {name: None for name in template_names}
else:
- try:
- for i in range(0, len(template_names), 2):
- templates[template_names[i]] = template_names[i + 1]
- except IndexError:
- pywikibot.output('Unless using solely -subst or -remove, '
- 'you must give an even number of template names.')
+ if len(template_names) % 2:
+ pywikibot.warning('Unless using solely -subst or -remove, you '
+ 'must give an even number of template names.')
return
+ templates = {key: value
+ for key, value in pairwise(template_names)}
+
old_templates = [pywikibot.Page(site, template_name, ns=10)
for template_name in templates]
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/820746
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I3fde479978960a7b033719ac5c5a26185c5cdd43
Gerrit-Change-Number: 820746
Gerrit-PatchSet: 6
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
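For reference, a small sketch of how the backported `pairwise` behaves, next to the index-based `range(0, len(seq), 2)` loops it replaces in the scripts above. `pairwise` yields overlapping pairs, while a step-2 loop visits disjoint pairs, so the two are not equivalent for flat `[old, new, old, new]` argument lists. `disjoint_pairs` is a hypothetical helper added only for comparison.

```python
from itertools import tee


def pairwise(iterable):
    """Return successive overlapping pairs, as in the backport."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def disjoint_pairs(seq):
    """Hypothetical helper: non-overlapping pairs for even-length input.

    Matches what a range(0, len(seq), 2) loop visits; a trailing odd
    element is silently dropped instead of raising IndexError.
    """
    it = iter(seq)
    return zip(it, it)


args = ['old1', 'new1', 'old2', 'new2']
print(list(pairwise(args)))        # includes the pair ('new1', 'old2')
print(list(disjoint_pairs(args)))  # only (old, new) pairs
```

With four arguments, `pairwise` produces three pairs and the step-2 iteration produces two, which may matter for callers that expect each argument to be consumed exactly once.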
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/820837 )
Change subject: [IMPR]: fix docstring for page.ocr() method.
......................................................................
[IMPR]: fix docstring for page.ocr() method.
Fix docstring for page.ocr() method:
- specify also 'wmfOCR' for ocr_tool parameter.
Change-Id: I4829d80d9ec79354df97c1a2e95dab0f4939ecc5
---
M pywikibot/proofreadpage.py
1 file changed, 2 insertions(+), 1 deletion(-)
Approvals:
Xqt: Verified; Looks good to me, approved
Mpaa: Looks good to me, approved
diff --git a/pywikibot/proofreadpage.py b/pywikibot/proofreadpage.py
index 2f98b6c..6004a73 100644
--- a/pywikibot/proofreadpage.py
+++ b/pywikibot/proofreadpage.py
@@ -736,7 +736,8 @@
It is the user's responsibility to reset quality level accordingly.
- :param ocr_tool: 'phetools' or 'googleOCR', default is 'phetools'
+ :param ocr_tool: 'phetools', 'wmfOCR' or 'googleOCR';
+ default is 'phetools'
:return: OCR text for the page.
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/820837
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I4829d80d9ec79354df97c1a2e95dab0f4939ecc5
Gerrit-Change-Number: 820837
Gerrit-PatchSet: 1
Gerrit-Owner: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged