jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/227073 )
Change subject: templatesWithParams: cache and standardise params
......................................................................
templatesWithParams: cache and standardise params
extract_templates_and_params has two implementations
(regex and mwparserfromhell) which have had different
settings for handling of unnecessary whitespace and
disabled wikitext.
Page.templatesWithParams previously used either
implementation, as-is, and therefore its returned value
varied depending on the implementation in use.
Combined with 84bd04258, this change ensures that
Page.templatesWithParams uses a consistent approach
by always removing unnecessary whitespace and disabled
wikitext, irrespective of the textlib implementation used.
This changeset means that the values returned by
Page.templatesWithParams can differ slightly from those it
previously returned.
The Page.templatesWithParams results are now also cached.
Bug: T113892
Change-Id: Id36011c93af673d07cb6169a7b43b562b985a102
---
M pywikibot/page.py
1 file changed, 33 insertions(+), 3 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/pywikibot/page.py b/pywikibot/page.py
index 0f30b58..98954ce 100644
--- a/pywikibot/page.py
+++ b/pywikibot/page.py
@@ -637,6 +637,8 @@
@param value: basestring
"""
self._text = None if value is None else unicode(value)
+ if hasattr(self, '_raw_extracted_templates'):
+ del self._raw_extracted_templates
@text.deleter
def text(self):
@@ -645,6 +647,8 @@
del self._text
if hasattr(self, '_expanded_text'):
del self._expanded_text
+ if hasattr(self, '_raw_extracted_templates'):
+ del self._raw_extracted_templates
def preloadText(self):
"""
@@ -2220,20 +2224,46 @@
'if source is a Site.')
super(Page, self).__init__(source, title, ns)
+ @property
+ def raw_extracted_templates(self):
+ """
+ Extract templates using L{textlib.extract_templates_and_params}.
+
+ Disabled parts and whitespace are stripped, except for
+ whitespace in anonymous positional arguments.
+
+ This value is cached.
+
+ @rtype: list of (str, OrderedDict)
+ """
+ if not hasattr(self, '_raw_extracted_templates'):
+ templates = textlib.extract_templates_and_params(
+ self.text, True, True)
+ self._raw_extracted_templates = templates
+
+ return self._raw_extracted_templates
+
@deprecate_arg("get_redirect", None)
def templatesWithParams(self):
"""
Return templates used on this Page.
- @return: a list that contains a tuple for each use of a template
+ The templates are extracted by L{textlib.extract_templates_and_params},
+ with positional arguments placed first in order, and each named
+ argument appearing as 'name=value'.
+
+ All parameter keys and values for each template are stripped of
+ whitespace.
+
+ @return: a list of tuples with one tuple for each template invocation
in the page, with the template Page as the first entry and a list of
parameters as the second entry.
- @rtype: list
+ @rtype: list of (Page, list)
"""
# WARNING: may not return all templates used in particularly
# intricate cases such as template substitution
titles = [t.title() for t in self.templates()]
- templates = textlib.extract_templates_and_params(self.text)
+ templates = self.raw_extracted_templates
# backwards-compatibility: convert the dict returned as the second
# element into a list in the format used by old scripts
result = []
--
To view, visit https://gerrit.wikimedia.org/r/227073
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Id36011c93af673d07cb6169a7b43b562b985a102
Gerrit-PatchSet: 7
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
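
The caching in the change above follows a common pattern: compute the value once, store it on the instance, and discard it whenever the text is replaced or deleted. The following is a minimal sketch of that pattern only, not pywikibot code. CachedPage, fake_extract and extract_calls are hypothetical names, and the regex is just a stand-in for textlib.extract_templates_and_params.

```python
import re


def fake_extract(text):
    """Hypothetical stand-in extractor: return template names only."""
    return re.findall(r'\{\{\s*([^|{}]+?)\s*(?:\||\}\})', text)


class CachedPage(object):
    """Hypothetical stand-in for Page, showing only the cache handling."""

    def __init__(self, text):
        self._text = text
        self.extract_calls = 0  # counts how often extraction really runs

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        self._text = value
        # Drop the cached result so it is recomputed for the new text.
        if hasattr(self, '_raw_extracted_templates'):
            del self._raw_extracted_templates

    @text.deleter
    def text(self):
        del self._text
        if hasattr(self, '_raw_extracted_templates'):
            del self._raw_extracted_templates

    @property
    def raw_extracted_templates(self):
        # Extract only on first access after the text was last changed.
        if not hasattr(self, '_raw_extracted_templates'):
            self.extract_calls += 1
            self._raw_extracted_templates = fake_extract(self._text)
        return self._raw_extracted_templates
```

Reading raw_extracted_templates twice in a row runs the extractor once; assigning to text forces a fresh extraction on the next read.
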
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/329050 )
Change subject: Rewrite claimit.py
......................................................................
Rewrite claimit.py
Fix -exists argument, replace exception with an error message, use
Claim.target_equals().
There's already a patch at I4c1c0b8b7 which has been stale for two
years, so I went ahead and tried to keep the changes small.
Bug: T69284
Change-Id: I1c3d13d51ca9f409173f046e5ac8ec4604b34917
---
M scripts/claimit.py
1 file changed, 60 insertions(+), 71 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/scripts/claimit.py b/scripts/claimit.py
index 19142df..e85a941 100755
--- a/scripts/claimit.py
+++ b/scripts/claimit.py
@@ -35,12 +35,12 @@
Suppose the claim you want to add has the same property as an existing claim
and the "-exists:p" argument is used. Now, claimit.py will not add the claim
-if it has the same target, sources, and/or qualifiers as the existing claim.
+if it has the same target, source, and/or the existing claim has qualifiers.
To override this behavior, add 't' (target), 's' (sources), or 'q' (qualifiers)
to the 'exists' argument.
For instance, to add the claim to each page even if one with the same
-property, target, and qualifiers already exists:
+property and target and some qualifiers already exists:
python pwb.py claimit [pagegenerators] P246 "string example" -exists:ptq
@@ -50,7 +50,7 @@
"""
#
# (C) Legoktm, 2013
-# (C) Pywikibot team, 2013-2014
+# (C) Pywikibot team, 2013-2017
#
# Distributed under the terms of the MIT license.
#
@@ -87,80 +87,67 @@
super(ClaimRobot, self).__init__(use_from_page=None)
self.generator = generator
self.claims = claims
- self.exists_arg = exists_arg
+ self.exists_arg = ''.join(x for x in exists_arg.lower() if x in 'pqst')
self.cacheSources()
if self.exists_arg:
- pywikibot.output('\'exists\' argument set to \'%s\'' % self.exists_arg)
+ pywikibot.output("'exists' argument set to '%s'" % self.exists_arg)
def treat(self, page, item):
"""Treat each page."""
self.current_page = page
+ # The generator might yield pages from multiple sites
+ source = self.getSource(page.site)
- if item:
- for claim in self.claims:
- skip = False
+ for claim in self.claims:
+ # Existing claims on page of same property
+ for existing in item.claims.get(claim.getID(), []):
# If claim with same property already exists...
- if claim.getID() in item.claims:
- if self.exists_arg is None or 'p' not in self.exists_arg:
- pywikibot.log(
- 'Skipping %s because claim with same property '
- 'already exists' % (claim.getID(),))
- pywikibot.log(
- 'Use -exists:p option to override this behavior')
- skip = True
- else:
- # Existing claims on page of same property
- existing_claims = item.claims[claim.getID()]
- for existing in existing_claims:
- skip = True # Default value
- # If some attribute of the claim being added
- # matches some attribute in an existing claim of
- # the same property, skip the claim, unless the
- # 'exists' argument overrides it.
- if (claim.getTarget() == existing.getTarget() and
- 't' not in self.exists_arg):
- pywikibot.log(
- 'Skipping %s because claim with same target already exists'
- % (claim.getID(),))
- pywikibot.log(
- 'Append \'t\' to -exists argument to override this behavior')
- break
- if (listsEqual(claim.getSources(), existing.getSources()) and
- 's' not in self.exists_arg):
- pywikibot.log(
- 'Skipping %s because claim with same sources already exists'
- % (claim.getID(),))
- pywikibot.log(
- 'Append \'s\' to -exists argument to override this behavior')
- break
- if (listsEqual(claim.qualifiers, existing.qualifiers) and
- 'q' not in self.exists_arg):
- pywikibot.log(
- 'Skipping %s because claim with same '
- 'qualifiers already exists' % (claim.getID(),))
- pywikibot.log(
- 'Append \'q\' to -exists argument to override this behavior')
- break
- skip = False
- if not skip:
- # A generator might yield pages from multiple languages
- self.user_add_claim(item, claim, page.site)
-
-
-def listsEqual(list1, list2):
- """
- Return true if the lists are probably equal, ignoring order.
-
- Works for lists of unhashable items (like dictionaries).
- """
- if len(list1) != len(list2):
- return False
- if sorted(list1) != sorted(list2):
- return False
- for item in list1:
- if item not in list2:
- return False
- return True
+ if 'p' not in self.exists_arg:
+ pywikibot.log(
+ 'Skipping %s because claim with same property already exists'
+ % (claim.getID(),))
+ pywikibot.log(
+ 'Use -exists:p option to override this behavior')
+ break
+ # If some attribute of the claim being added
+ # matches some attribute in an existing claim of
+ # the same property, skip the claim, unless the
+ # 'exists' argument overrides it.
+ if (existing.target_equals(claim.getTarget()) and
+ 't' not in self.exists_arg):
+ pywikibot.log(
+ 'Skipping %s because claim with same target already exists'
+ % (claim.getID(),))
+ pywikibot.log(
+ "Append 't' to -exists argument to override this behavior")
+ break
+ if 'q' not in self.exists_arg and not existing.qualifiers:
+ pywikibot.log(
+ 'Skipping %s because claim without qualifiers already exists'
+ % (claim.getID(),))
+ pywikibot.log(
+ "Append 'q' to -exists argument to override this behavior")
+ break
+ if ('s' not in self.exists_arg or not source) and not existing.sources:
+ pywikibot.log(
+ 'Skipping %s because claim without source already exists'
+ % (claim.getID(),))
+ pywikibot.log(
+ "Append 's' to -exists argument to override this behavior")
+ break
+ if ('s' not in self.exists_arg and source and
+ any(source.getID() in ref and
+ all(snak.target_equals(source.getTarget())
+ for snak in ref[source.getID()])
+ for ref in existing.sources)):
+ pywikibot.log(
+ 'Skipping %s because claim with the same source already exists'
+ % (claim.getID(),))
+ pywikibot.log(
+ "Append 's' to -exists argument to override this behavior")
+ break
+ else:
+ self.user_add_claim(item, claim, page.site)
def main(*args):
@@ -171,6 +158,7 @@
@param args: command line arguments
@type args: list of unicode
+ @rtype: bool
"""
exists_arg = ''
commandline_claims = list()
@@ -182,14 +170,15 @@
for arg in local_args:
# Handle args specifying how to handle duplicate claims
if arg.startswith('-exists:'):
- exists_arg = arg.split(':')[1].strip('"')
+ exists_arg = arg.split(':')[1]
continue
# Handle page generator args
if gen.handleArg(arg):
continue
commandline_claims.append(arg)
if len(commandline_claims) % 2:
- raise ValueError # or something.
+ pywikibot.error('Incomplete command line property-value pair.')
+ return False
claims = list()
repo = pywikibot.Site().data_repository()
--
To view, visit https://gerrit.wikimedia.org/r/329050
Gerrit-MessageType: merged
Gerrit-Change-Id: I1c3d13d51ca9f409173f046e5ac8ec4604b34917
Gerrit-PatchSet: 5
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Legoktm <legoktm(a)member.fsf.org>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
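
Two idioms carry most of the rewrite above: the -exists value is reduced to its recognised letters, and a for/else loop adds the claim only when no existing claim caused a break. A simplified sketch under stated assumptions: normalise_exists_arg and should_add are hypothetical helpers, claims are reduced to plain target values, and only the 'p' and 't' checks are modelled.

```python
def normalise_exists_arg(raw):
    """Keep only the recognised flag letters, lower-cased, in input order."""
    return ''.join(x for x in raw.lower() if x in 'pqst')


def should_add(existing_targets, new_target, exists_arg):
    """Return True if no existing claim of the same property blocks the new one.

    existing_targets is a list of target values of claims that already
    use the same property (an assumption made for this sketch).
    """
    for target in existing_targets:
        if 'p' not in exists_arg:
            break  # a claim with the same property already exists
        if target == new_target and 't' not in exists_arg:
            break  # a claim with the same target already exists
    else:
        # The loop finished without break, so nothing blocked the claim.
        return True
    return False
```

Because the filter drops every character outside 'pqst', stray quotation marks in the argument are discarded as well, which is presumably why the explicit strip('"') could be removed from main().
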
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/351101 )
Change subject: tools_tests.py: Take care of windows line endings
......................................................................
tools_tests.py: Take care of windows line endings
Depending on the git configuration, the line endings of checked-out
files may be "\r\n" instead of "\n" for Windows users. This changes
the hash values of the files and makes the tests fail. Add hash
values for Windows line endings.
Replace b'\r\n' with b'\n' in the content of uncompressed files.
Change-Id: I5babeb7b3ea5d1800449e37a8a2b2c358eec663a
---
M tests/tools_tests.py
1 file changed, 34 insertions(+), 13 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/tests/tools_tests.py b/tests/tools_tests.py
index abf00b8..c0e916c 100644
--- a/tests/tools_tests.py
+++ b/tests/tools_tests.py
@@ -98,12 +98,12 @@
super(OpenArchiveTestCase, cls).setUpClass()
cls.base_file = join_xml_data_path('article-pyrus.xml')
with open(cls.base_file, 'rb') as f:
- cls.original_content = f.read()
+ cls.original_content = f.read().replace(b'\r\n', b'\n')
def _get_content(self, *args, **kwargs):
"""Use open_archive and return content using a with-statement."""
with tools.open_archive(*args, **kwargs) as f:
- return f.read()
+ return f.read().replace(b'\r\n', b'\n')
def test_open_archive_normal(self):
"""Test open_archive with no compression in the standard library."""
@@ -176,7 +176,7 @@
kwargs['use_extension'] = True
with tools.open_compressed(*args, **kwargs) as f:
- content = f.read()
+ content = f.read().replace(b'\r\n', b'\n')
self.assertOneDeprecation(self.INSTEAD)
return content
@@ -193,7 +193,7 @@
super(OpenArchiveWriteTestCase, cls).setUpClass()
cls.base_file = join_xml_data_path('article-pyrus.xml')
with open(cls.base_file, 'rb') as f:
- cls.original_content = f.read()
+ cls.original_content = f.read().replace(b'\r\n', b'\n')
def _write_content(self, suffix):
try:
@@ -754,7 +754,12 @@
class TestFileShaCalculator(TestCase):
- """Test calculator of sha of a file."""
+ r"""Test calculator of sha of a file.
+
+ There are two possible hash values for each test. The second one is for
+ files with windows line endings (\r\n).
+
+ """
net = False
@@ -767,37 +772,53 @@
def test_md5_complete_calculation(self):
"""Test md5 of complete file."""
res = tools.compute_file_hash(self.filename, sha='md5')
- self.assertEqual(res, '5d7265e290e6733e1e2020630262a6f3')
+ self.assertIn(res, (
+ '5d7265e290e6733e1e2020630262a6f3',
+ '2c941f2fa7e6e629d165708eb02b67f7',
+ ))
def test_md5_partial_calculation(self):
"""Test md5 of partial file (1024 bytes)."""
res = tools.compute_file_hash(self.filename, sha='md5',
bytes_to_read=1024)
- self.assertEqual(res, 'edf6e1accead082b6b831a0a600704bc')
+ self.assertIn(res, (
+ 'edf6e1accead082b6b831a0a600704bc',
+ 'be0227b6d490baa49e6d7e131c7f596b',
+ ))
def test_sha1_complete_calculation(self):
"""Test sha1 of complete file."""
res = tools.compute_file_hash(self.filename, sha='sha1')
- self.assertEqual(res, '1c12696e1119493a625aa818a35c41916ce32d0c')
+ self.assertIn(res, (
+ '1c12696e1119493a625aa818a35c41916ce32d0c',
+ '146121e6d0461916c9a0fab00dc718acdb6a6b14',
+ ))
def test_sha1_partial_calculation(self):
"""Test sha1 of partial file (1024 bytes)."""
res = tools.compute_file_hash(self.filename, sha='sha1',
bytes_to_read=1024)
- self.assertEqual(res, 'e56fa7bd5cfdf6bb7e2d8649dd9216c03e7271e6')
+ self.assertIn(res, (
+ 'e56fa7bd5cfdf6bb7e2d8649dd9216c03e7271e6',
+ '617ce7d539848885b52355ed597a042dae1e726f',
+ ))
def test_sha224_complete_calculation(self):
"""Test sha224 of complete file."""
res = tools.compute_file_hash(self.filename, sha='sha224')
- self.assertEqual(
- res, '3d350d9d9eca074bd299cb5ffe1b325a9f589b2bcd7ba1c033ab4d33')
+ self.assertIn(res, (
+ '3d350d9d9eca074bd299cb5ffe1b325a9f589b2bcd7ba1c033ab4d33',
+ '4a2cf33b7da01f7b0530b2cc624e1180c8651b20198e9387aee0c767',
+ ))
def test_sha224_partial_calculation(self):
"""Test sha224 of partial file (1024 bytes)."""
res = tools.compute_file_hash(self.filename, sha='sha224',
bytes_to_read=1024)
- self.assertEqual(
- res, 'affa8cb79656a9b6244a079f8af91c9271e382aa9d5aa412b599e169')
+ self.assertIn(res, (
+ 'affa8cb79656a9b6244a079f8af91c9271e382aa9d5aa412b599e169',
+ '486467144e683aefd420d576250c4cc984e6d7bf10c85d36e3d249d2',
+ ))
class Foo(object):
--
To view, visit https://gerrit.wikimedia.org/r/351101
Gerrit-MessageType: merged
Gerrit-Change-Id: I5babeb7b3ea5d1800449e37a8a2b2c358eec663a
Gerrit-PatchSet: 2
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
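
The effect described in the change above can be reproduced with the standard library alone: the same text hashes differently once "\n" becomes "\r\n", and replacing b'\r\n' with b'\n' restores the original bytes. A small sketch in which the sample content and the helper name md5_hex are made up for illustration:

```python
import hashlib


def md5_hex(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


# The same two lines, once with Unix and once with Windows line endings.
unix_content = b'line one\nline two\n'
windows_content = unix_content.replace(b'\n', b'\r\n')

# The digests differ because the underlying bytes differ.
hashes_differ = md5_hex(unix_content) != md5_hex(windows_content)

# Normalising the line endings, as the tests now do, removes the difference.
normalised = windows_content.replace(b'\r\n', b'\n')
hashes_match = md5_hex(normalised) == md5_hex(unix_content)
```

Normalising works for tests that compare file content. The hash tests read the file from disk as-is, so they accept either of two known digests instead.
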
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/351009 )
Change subject: Remove D211, D102, E241, and E731 from flake8 ignore codes
......................................................................
Remove D211, D102, E241, and E731 from flake8 ignore codes
Those parts that should be ignored are already ignored using putty-ignore.
Change-Id: I1cf85c762af1c3cb1f8b514e9cec26689b0c5bff
---
M tox.ini
1 file changed, 1 insertion(+), 5 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/tox.ini b/tox.ini
index afff122..d12e021 100644
--- a/tox.ini
+++ b/tox.ini
@@ -139,15 +139,11 @@
# P102,P103: string does contain unindexed parameters; see I36355923
# Errors occured after upgrade to pydocstyle 2.0.0 (T164142)
-# D211: No blank lines allowed before class docstring
-# D102: Missing docstring in public method
# D401: First line should be in imperative mood; try rephrasing
-# E241: multiple spaces after ':'
# D413: Missing blank line after last section
# D412: No blank lines allowed between a section header and its content
-# E731: do not assign a lambda expression, use a def
-ignore = C401,C402,C405,E402,D105,D211,FI10,FI12,FI13,FI15,FI16,FI17,FI5,H101,H201,H236,H301,H404,H405,I100,I101,N802,N803,N806,D211,D102,D401,E241,D413,D103,D412,E731
+ignore = C401,C402,C405,E402,D105,D211,FI10,FI12,FI13,FI15,FI16,FI17,FI5,H101,H201,H236,H301,H404,H405,I100,I101,N802,N803,N806,D401,D413,D103,D412
exclude = .tox,.git,./*.egg,ez_setup.py,build,externals,user-config.py,./scripts/i18n/*
min-version = 2.6
max_line_length = 100
--
To view, visit https://gerrit.wikimedia.org/r/351009
Gerrit-MessageType: merged
Gerrit-Change-Id: I1cf85c762af1c3cb1f8b514e9cec26689b0c5bff
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
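
Comparing the two ignore lines from the diff above as sets shows which codes actually stop being ignored. The two strings below are copied from the diff; the variable names are only illustrative.

```python
old_ignore = (
    'C401,C402,C405,E402,D105,D211,FI10,FI12,FI13,FI15,FI16,FI17,FI5,'
    'H101,H201,H236,H301,H404,H405,I100,I101,N802,N803,N806,'
    'D211,D102,D401,E241,D413,D103,D412,E731'
).split(',')
new_ignore = (
    'C401,C402,C405,E402,D105,D211,FI10,FI12,FI13,FI15,FI16,FI17,FI5,'
    'H101,H201,H236,H301,H404,H405,I100,I101,N802,N803,N806,'
    'D401,D413,D103,D412'
).split(',')

# Codes present in the old list but absent from the new one.
removed = sorted(set(old_ignore) - set(new_ignore))

# D211 was listed twice before; one occurrence remains afterwards.
d211_before = old_ignore.count('D211')
d211_after = new_ignore.count('D211')
```

By this comparison, D102, E241 and E731 leave the list entirely. For D211 only the duplicate entry is dropped, so the code is still ignored after the change.
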