jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926726 )
Change subject: [doc] Show a warning for FilePage.download()
......................................................................
[doc] Show a warning for FilePage.download()
FilePage.download() overwrites an existing file without further notice.
Change-Id: Ida25a2f6f5f73d94d4cd996d406cc946593e7bdb
---
M pywikibot/page/_filepage.py
1 file changed, 13 insertions(+), 0 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/page/_filepage.py b/pywikibot/page/_filepage.py
index 79f59a0..a6c6473 100644
--- a/pywikibot/page/_filepage.py
+++ b/pywikibot/page/_filepage.py
@@ -312,6 +312,8 @@
iterable of path segments.
.. note:: filename suffix is adjusted if target url's suffix is
different which may be the case if a thumbnail is loaded.
+ .. warning:: If a file already exists, it will be overridden
+ without further notes.
.. seealso:: :api:`Imageinfo` for new parameters
:param filename: filename where to save file. If ``None``,
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926726
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ida25a2f6f5f73d94d4cd996d406cc946593e7bdb
Gerrit-Change-Number: 926726
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
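Because FilePage.download() overwrites an existing target silently, a caller who wants to keep existing files has to check for them first. A minimal sketch of such a check follows; ensure_free_target() is a hypothetical helper and not part of Pywikibot.

```python
from pathlib import Path


def ensure_free_target(filename: str) -> Path:
    """Return filename as a Path, or raise if something already exists there.

    Hypothetical helper for illustration only; not part of Pywikibot.
    """
    path = Path(filename).expanduser()
    if path.exists():
        raise FileExistsError(f'{path} already exists; refusing to overwrite')
    return path
```

The check is best-effort only: download() may adjust the file suffix (for example when a thumbnail is requested), so the name finally written can differ from the one checked here.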
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926657 )
Change subject: [IMPR] Print an error message if a fixes entry is not a dict
......................................................................
[IMPR] Print an error message if a fixes entry is not a dict
If a fixes entry is not a dict, print an error message and, if the
-debug option is given, write the fix entry to the log file.
https://www.mediawiki.org/wiki/Topic:Xizzw3qpwgt24d62
Change-Id: I48134579d92d7dd5a5a9d329ca81b2e1191403b1
---
M scripts/replace.py
1 file changed, 24 insertions(+), 1 deletion(-)
Approvals:
Matěj Suchánek: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/replace.py b/scripts/replace.py
index feb6541..4cce7ab 100755
--- a/scripts/replace.py
+++ b/scripts/replace.py
@@ -883,7 +883,7 @@
return pagegenerators.MySQLPageGenerator(sql)
-def main(*args: str) -> None:
+def main(*args: str) -> None: # noqa: C901
"""
Process command line arguments and invoke bot.
@@ -1000,6 +1000,15 @@
pywikibot.info(f'The user fixes file could not be found: '
f'{fixes.filename}')
return
+
+ if not isinstance(fix, dict):
+ pywikibot.error(
+ f'fixes[{fix_name!r}] is a {type(fix).__name__}, not a dict')
+ if type(fix) is tuple:
+ pywikibot.info('Maybe a trailing comma in your user_fixes.py?')
+ pywikibot.debug(fix)
+ return
+
if not fix['replacements']:
pywikibot.warning(f'No replacements defined for fix {fix_name!r}')
continue
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926657
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I48134579d92d7dd5a5a9d329ca81b2e1191403b1
Gerrit-Change-Number: 926657
Gerrit-PatchSet: 3
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
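The tuple case handled by this change typically comes from a stray trailing comma after the closing brace of a fixes entry. The sketch below reproduces that situation outside of Pywikibot; check_fix() is a hypothetical stand-in that returns the message instead of calling pywikibot.error().

```python
def check_fix(fix_name, fix):
    """Return an error message if fix is not a dict, else None.

    Hypothetical stand-in for the check added to scripts/replace.py.
    """
    if isinstance(fix, dict):
        return None
    msg = f'fixes[{fix_name!r}] is a {type(fix).__name__}, not a dict'
    if type(fix) is tuple:
        msg += '; maybe a trailing comma in your user_fixes.py?'
    return msg


fixes = {}

# The trailing comma after the closing brace turns the dict into a 1-tuple.
fixes['broken'] = {
    'regex': False,
    'replacements': [('foo', 'bar')],
},

# Without the comma the entry is a plain dict, as replace.py expects.
fixes['good'] = {
    'regex': False,
    'replacements': [('foo', 'bar')],
}
```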
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926673 )
Change subject: [doc] update ROADMAP.rst and CHANGELOG.rst
......................................................................
[doc] update ROADMAP.rst and CHANGELOG.rst
Change-Id: I421e0918d2a0e1a03e0187743e77a41c6d753379
---
M ROADMAP.rst
M scripts/CHANGELOG.rst
2 files changed, 21 insertions(+), 2 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/ROADMAP.rst b/ROADMAP.rst
index ef0878b..ce37c11 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -1,7 +1,12 @@
Current release
---------------
-* return 'https' scheme with :meth:`family.Family.protocol` (:phab:`T326046`)
+* Enable :meth:`FilePage.download()<pywikibot.FilePage.download>` to download thumbnails (:phab:`T247095`)
+* Refactor :func:`tools.compute_file_hash` and use ``hashlib.file_digest`` with Python 3.11
+* Url ends with curly bracket in :func:`textlib.compileLinkR` (:phab:`T338029`)
+* Allows spaces in environment variables for :class:`editor.TextEditor` (:phab:`T102465`, :phab:`T323078`)
+* Add :func:`textlib.get_regexes` puplic function (:phab:`T336144`)
+* Return 'https' scheme with :meth:`family.Family.protocol` (:phab:`T326046`)
* Use ``build`` instead of ``setuptools.setup()`` to build the distribution
* Raise ``ConnectionError`` on ``requests.ReadTimeout`` in :func:`comms.http.error_handling_callback`
* Raise :exc:`exceptions.ServerError` on ``requests.ReadTimeout`` in :func:`comms.http.error_handling_callback`
@@ -46,7 +51,7 @@
* 7.2.0: RedirectPageBot and NoRedirectPageBot bot classes are deprecated in favour of
:attr:`use_redirects<bot.BaseBot.use_redirects>` attribute
* 7.2.0: :func:`tools.formatter.color_format<tools.formatter.color_format>` is deprecated and will be removed
-* 7.1.0: Unused `get_redirect` parameter of Page.getOldVersion() will be removed
+* 7.1.0: Unused ``get_redirect`` parameter of :meth:`Page.getOldVersion()<page.BasePage.getOldVersion>` will be removed
* 7.1.0: APISite._simple_request() will be removed in favour of APISite.simple_request()
* 7.0.0: User.isBlocked() method is renamed to is_blocked for consistency
* 7.0.0: Private BaseBot counters _treat_counter, _save_counter, _skip_counter will be removed in favour of collections.Counter counter attribute
diff --git a/scripts/CHANGELOG.rst b/scripts/CHANGELOG.rst
index d4f4f2c..12796d4 100644
--- a/scripts/CHANGELOG.rst
+++ b/scripts/CHANGELOG.rst
@@ -9,6 +9,11 @@
* KeyboardInterrupt was enabled for -async option
+listpages
+~~~~~~~~~
+
+* ``-tofile`` option was added to save list to a file
+
noreferences
~~~~~~~~~~~~
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926673
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I421e0918d2a0e1a03e0187743e77a41c6d753379
Gerrit-Change-Number: 926673
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/923761 )
Change subject: [IMPR] Listpages.py: save list to a file
......................................................................
[IMPR] Listpages.py: save list to a file
Adding a new option to listpages.py to save the list of pages to a file,
instead of just printing it to the console or uploading it to the wiki.
The format options can be used to specify the format of the output.
Change-Id: I1cc834f5ec4b2132ff6295dd72475478c7692c46
---
M scripts/listpages.py
1 file changed, 24 insertions(+), 2 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/listpages.py b/scripts/listpages.py
index 7b225e6..e4298d7 100755
--- a/scripts/listpages.py
+++ b/scripts/listpages.py
@@ -47,6 +47,9 @@
-get Page content is printed.
+-tofile Save Page titles to a single file. File name can be set
+ with -tofile:filename or -tofile:dir_name/filename.
+
-save Save Page content to a file named as page.title(as_filename=True).
Directory can be set with -save:dir_name
If no dir is specified, current directory will be used.
@@ -176,6 +179,7 @@
available_options = {
'always': True,
'save': None,
+ 'tofile': None,
'encode': config.textfile_encoding,
'format': '1',
'notitle': False,
@@ -190,7 +194,7 @@
def treat(self, page) -> None:
"""Process one page and add it to the `output_list`."""
self.num += 1
- if not self.opt.notitle:
+ if self.opt.tofile or not self.opt.notitle:
page_fmt = Formatter(page, self.opt.outputlang)
self.output_list += [page_fmt.output(num=self.num,
fmt=self.opt.format)]
@@ -241,12 +245,17 @@
self.opt.save = base_dir
def teardown(self) -> None:
- """Print the list and put it to the target page if specified."""
+ """Print list, if selected put it to wiki page or save it to a file."""
text = '\n'.join(self.output_list)
if self.opt.put:
self.current_page = self.opt.put
self.put_current(text, summary=self.opt.summary, show_diff=False)
+ if self.opt.tofile:
+ pywikibot.info(f'Writing page titles to {self.opt.tofile}')
+ with open(self.opt.tofile, 'w', encoding='utf-8') as f:
+ f.write(text)
+
if self.opt.preloading is True:
pywikibot.stdout(text)
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/923761
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I1cc834f5ec4b2132ff6295dd72475478c7692c46
Gerrit-Change-Number: 923761
Gerrit-PatchSet: 4
Gerrit-Owner: Dr03ramos <dr03ramos(a)gmail.com>
Gerrit-Reviewer: D3r1ck01 <xsavitar.wiki(a)aol.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-CC: Welcome, new contributor! <ssethi(a)wikimedia.org>
Gerrit-MessageType: merged
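The file-writing step of the -tofile option can be sketched on its own as follows. write_titles() is a hypothetical name, and creating the parent directory is an assumption added here for convenience; the merged change opens the given path directly.

```python
from pathlib import Path


def write_titles(titles, target):
    """Write one title per line to target as UTF-8 and return the path.

    Hypothetical helper; the mkdir() call is an addition of this sketch.
    """
    path = Path(target)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text('\n'.join(titles), encoding='utf-8')
    return path
```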
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/924058 )
Change subject: [IMPR] Enable FilePage.download() to download thumbnails
......................................................................
[IMPR] Enable FilePage.download() to download thumbnails
- add *url_width*, *url_height* and *url_param* to FilePage.download
- allow filename to be a PathLike object or an iterable of path sections
- use the title if only a user path specifier is given
- adjust path suffix if required
- remove the try clause for the OSError exception because the exception
was only re-raised without any further handling
- update documentation
- add tests
Bug: T247095
Change-Id: I21f2bb9a15681540044b18fcd71d54061bc12913
---
M pywikibot/page/_filepage.py
M pywikibot/site/_apisite.py
M tests/file_tests.py
3 files changed, 177 insertions(+), 58 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/page/_filepage.py b/pywikibot/page/_filepage.py
index 1916925..79f59a0 100644
--- a/pywikibot/page/_filepage.py
+++ b/pywikibot/page/_filepage.py
@@ -10,10 +10,14 @@
#
# Distributed under the terms of the MIT license.
#
-import os.path
from http import HTTPStatus
+from os import PathLike
+from pathlib import Path
+from typing import Optional, Union
+from urllib.parse import urlparse
import pywikibot
+from pywikibot.backports import Iterable
from pywikibot.comms import http
from pywikibot.exceptions import NoPageError
from pywikibot.page._page import Page
@@ -107,25 +111,31 @@
self._imagePageHtml = http.request(self.site, path).text
return self._imagePageHtml
- def get_file_url(self, url_width=None, url_height=None,
- url_param=None) -> str:
- """
- Return the url or the thumburl of the file described on this page.
+ def get_file_url(self,
+ url_width: Optional[int] = None,
+ url_height: Optional[int] = None,
+ url_param: Optional[int] = None) -> str:
+ """Return the url or the thumburl of the file described on this page.
Fetch the information if not available.
- Once retrieved, thumburl information will also be accessible as
- latest_file_info attributes, named as in [1]:
- - url, thumburl, thumbwidth and thumbheight
+ Once retrieved, file information will also be accessible as
+ :attr:`latest_file_info` attributes, named as in :api:`Imageinfo`.
+ If *url_width*, *url_height* or *url_param* is given, additional
+ properties ``thumbwidth``, ``thumbheight``, ``thumburl`` and
+ ``responsiveUrls`` are provided.
- Parameters correspond to iiprops in:
- [1] :api:`Imageinfo`
+ .. note:: Parameters validation and error handling left to the
+ API call.
+ .. seealso::
- Parameters validation and error handling left to the API call.
+ * :meth:`APISite.loadimageinfo()
+ <pywikibot.site._apisite.APISite.loadimageinfo>`
+ * :api:`Imageinfo`
- :param url_width: see iiurlwidth in [1]
- :param url_height: see iiurlheigth in [1]
- :param url_param: see iiurlparam in [1]
+ :param url_width: get info for a thumbnail with given width
+ :param url_height: get info for a thumbnail with given height
+ :param url_param: get info for a thumbnail with given param
:return: latest file url or thumburl
"""
# Plain url is requested.
@@ -267,47 +277,95 @@
return self.site.upload(self, source_filename=filename, source_url=url,
**kwargs)
- def download(self, filename=None, chunk_size=100 * 1024, revision=None):
- """
- Download to filename file of FilePage.
+ def download(self,
+ filename: Union[None, str, PathLike, Iterable[str]] = None,
+ chunk_size: int = 100 * 1024,
+ revision: Optional['FileInfo'] = None, *,
+ url_width: Optional[int] = None,
+ url_height: Optional[int] = None,
+ url_param: Optional[int] = None) -> bool:
+ """Download to filename file of FilePage.
- :param filename: filename where to save file:
- None: self.title(as_filename=True, with_ns=False)
- will be used
- str: provided filename will be used.
- :type filename: None or str
+ **Usage examples:**
+
+ Download an image:
+
+ >>> site = pywikibot.Site('wikipedia:test')
+ >>> file = pywikibot.FilePage(site, 'Pywikibot MW gear icon.svg')
+ >>> file.download()
+ True
+
+ Pywikibot_MW_gear_icon.svg was downloaded.
+
+ Download a thumnail:
+
+ >>> file.download(url_param='120px')
+ True
+
+ The suffix has changed and Pywikibot_MW_gear_icon.png was
+ downloaded.
+
+ .. versionadded:: 8.2
+ *url_width*, *url_height* and *url_param* parameters.
+ .. versionchanged:: 8.2
+ *filename* argument may be also a path-like object or an
+ iterable of path segments.
+ .. note:: filename suffix is adjusted if target url's suffix is
+ different which may be the case if a thumbnail is loaded.
+ .. seealso:: :api:`Imageinfo` for new parameters
+
+ :param filename: filename where to save file. If ``None``,
+ ``self.title(as_filename=True, with_ns=False)`` will be used.
+ If an Iterable is specified the items will be used as path
+ segments. To specify the user directory path you have to use
+ either ``~`` or ``~user`` as first path segment e.g. ``~/foo``
+ or ``('~', 'foo')`` as filename. If only the user directory
+ specifier is given, the title is used as filename like for
+ None. If the suffix is missing or different from url (which
+ can happen if a *url_width*, *url_height* or *url_param*
+ argument is given), the file suffix is adjusted.
:param chunk_size: the size of each chunk to be received and
written to file.
- :type chunk_size: int
- :param revision: file revision to download:
- None: self.latest_file_info will be used
- FileInfo: provided revision will be used.
- :type revision: None or FileInfo
+ :param revision: file revision to download. If None
+ :attr:`latest_file_info` will be used; otherwise provided
+ revision will be used.
+ :param url_width: download thumbnail with given width
+ :param url_height: download thumbnail with given height
+ :param url_param: download thumbnail with given param
:return: True if download is successful, False otherwise.
:raise IOError: if filename cannot be written for any reason.
"""
- if filename is None:
- filename = self.title(as_filename=True, with_ns=False)
+ if not filename:
+ path = Path()
+ elif isinstance(filename, (str, PathLike)):
+ path = Path(filename)
+ else:
+ path = Path(*filename)
- filename = os.path.expanduser(filename)
+ if path.stem in ('', '~', '~user'):
+ path = path / self.title(as_filename=True, with_ns=False)
- if revision is None:
+ thumb = bool(url_width or url_height or url_param)
+ if thumb or revision is None:
+ url = self.get_file_url(url_width, url_height, url_param)
revision = self.latest_file_info
+ else:
+ url = revision.url
- req = http.fetch(revision.url, stream=True)
+ # adjust suffix
+ path = path.with_suffix(Path(urlparse(url).path).suffix)
+ # adjust user path
+ path = path.expanduser()
+ req = http.fetch(url, stream=True)
if req.status_code == HTTPStatus.OK:
- try:
- with open(filename, 'wb') as f:
- for chunk in req.iter_content(chunk_size):
- f.write(chunk)
- except OSError as e:
- raise e
+ with open(path, 'wb') as f:
+ for chunk in req.iter_content(chunk_size):
+ f.write(chunk)
- sha1 = compute_file_hash(filename)
- return sha1 == revision.sha1
+ return thumb or compute_file_hash(path) == revision.sha1
+
pywikibot.warning(
- 'Unsuccessful request ({}): {}'
- .format(req.status_code, req.url))
+ f'Unsuccessful request ({req.status_code}): {req.url}')
return False
def globalusage(self, total=None):
diff --git a/pywikibot/site/_apisite.py b/pywikibot/site/_apisite.py
index a9decae..77a9a32 100644
--- a/pywikibot/site/_apisite.py
+++ b/pywikibot/site/_apisite.py
@@ -1372,24 +1372,34 @@
) -> None:
"""Load image info from api and save in page attributes.
- Parameters correspond to iiprops in:
- [1] :api:`Imageinfo`
+ The following properties are loaded: ``timestamp``, ``user``,
+ ``comment``, ``url``, ``size``, ``sha1``, ``mime``, ``mediatype``,
+ ``metadata``, ``archivename`` and ``bitdepth``. If *url_width*,
+ *url_height* or *url_param* is given, additional properties
+ ``thumbwidth``, ``thumbheight``, ``thumburl`` and
+ ``responsiveUrls`` are given.
- Parameters validation and error handling left to the API call.
+ .. note:: Parameters validation and error handling left to the
+ API call.
+ .. versionchanged:: 8.2
+ *mediatype* and *bitdepth* properties were added.
+ .. seealso:: :api:`Imageinfo`
:param history: if true, return the image's version history
- :param url_width: see iiurlwidth in [1]
- :param url_height: see iiurlheigth in [1]
- :param url_param: see iiurlparam in [1]
-
+ :param url_width: get info for a thumbnail with given width
+ :param url_height: get info for a thumbnail with given height
+ :param url_param: get info for a thumbnail with given param
"""
- args = {'titles': page.title(with_section=False),
- 'iiurlwidth': url_width,
- 'iiurlheight': url_height,
- 'iiurlparam': url_param,
- 'iiprop': ['timestamp', 'user', 'comment', 'url', 'size',
- 'sha1', 'mime', 'metadata', 'archivename']
- }
+ args = {
+ 'titles': page.title(with_section=False),
+ 'iiurlwidth': url_width,
+ 'iiurlheight': url_height,
+ 'iiurlparam': url_param,
+ 'iiprop': [
+ 'timestamp', 'user', 'comment', 'url', 'size', 'sha1', 'mime',
+ 'mediatype', 'metadata', 'archivename', 'bitdepth',
+ ]
+ }
if not history:
args['total'] = 1
query = self._generator(api.PropertyGenerator,
diff --git a/tests/file_tests.py b/tests/file_tests.py
index 38325c8..be0fafd 100755
--- a/tests/file_tests.py
+++ b/tests/file_tests.py
@@ -280,12 +280,44 @@
cached = True
def test_successful_download(self):
- """Test successful_download."""
+ """Test successful download."""
page = pywikibot.FilePage(self.site, 'File:Albert Einstein.jpg')
filename = join_images_path('Albert Einstein.jpg')
status_code = page.download(filename)
self.assertTrue(status_code)
- os.unlink(filename)
+ oldsize = os.stat(filename).st_size
+
+ status_code = page.download(filename, url_height=128)
+ self.assertTrue(status_code)
+ size = os.stat(filename).st_size
+ self.assertLess(size, oldsize)
+
+ status_code = page.download(filename, url_width=120)
+ self.assertTrue(status_code)
+ size = os.stat(filename).st_size
+ self.assertLess(size, oldsize)
+
+ status_code = page.download(filename, url_param='120px')
+ self.assertTrue(status_code)
+ self.assertEqual(size, os.stat(filename).st_size)
+
+ os.remove(filename)
+
+ def test_changed_title(self):
+ """Test changed title."""
+ page = pywikibot.FilePage(self.site, 'Pywikibot MW gear icon.svg')
+ filename = join_images_path('Pywikibot MW gear icon.svg')
+ status_code = page.download(filename)
+ self.assertTrue(status_code)
+ self.assertTrue(os.path.exists(filename))
+
+ status_code = page.download(filename, url_param='120px')
+ self.assertTrue(status_code)
+ new_filename = filename.replace('.svg', '.png')
+ self.assertTrue(os.path.exists(new_filename))
+
+ os.remove(filename)
+ os.remove(new_filename)
def test_not_existing_download(self):
"""Test not existing download."""
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/924058
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I21f2bb9a15681540044b18fcd71d54061bc12913
Gerrit-Change-Number: 924058
Gerrit-PatchSet: 10
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Framawiki <framawiki(a)tools.wmflabs.org>
Gerrit-Reviewer: TheSandDoctor <majorjohn1(a)mail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
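The filename handling introduced by this change can be followed in isolation with the sketch below. It mirrors the path logic from the diff but takes the page title and the download URL as plain arguments; target_path() is a hypothetical name, the example.org URLs are placeholders, and the final expanduser() step is left out so that the result does not depend on the local home directory.

```python
from os import PathLike
from pathlib import Path
from urllib.parse import urlparse


def target_path(filename, title, url):
    """Build the local path for a download (hypothetical helper).

    filename may be None, a str, a path-like object or an iterable
    of path segments; title is used when no file name is given.
    """
    if not filename:
        path = Path()
    elif isinstance(filename, (str, PathLike)):
        path = Path(filename)
    else:
        path = Path(*filename)

    # Only a directory specifier was given: fall back to the title.
    if path.stem in ('', '~', '~user'):
        path = path / title

    # The suffix follows the URL, e.g. .png for an SVG thumbnail.
    return path.with_suffix(Path(urlparse(url).path).suffix)
```

For an SVG file whose thumbnail URL ends in .png, target_path('~', 'Icon.svg', url) gives ~/Icon.png, which corresponds to the suffix change described in the docstring example.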
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/924492 )
Change subject: [IMPR] Use hashlib.file_digest with Python 3.11
......................................................................
[IMPR] Use hashlib.file_digest with Python 3.11
hashlib.file_digest was introduced with Python 3.11. Use this function
in tools.compute_file_hash() function if no bytes_to_read is given.
- refactor compute_file_hash
- enable a hash constructor or a callable to be used with
compute_file_hash like in hashlib.file_digest()
- update documentation
- add some tests
Change-Id: I9d58150c67123e619f15c8c502aaaaf2abe78ed8
---
M pywikibot/tools/__init__.py
M tests/tools_tests.py
2 files changed, 80 insertions(+), 38 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/tools/__init__.py b/pywikibot/tools/__init__.py
index 25f2395..ae9e715 100644
--- a/pywikibot/tools/__init__.py
+++ b/pywikibot/tools/__init__.py
@@ -19,7 +19,7 @@
from functools import total_ordering, wraps
from importlib import import_module
from types import TracebackType
-from typing import Any, Optional, Type
+from typing import Any, Optional, Type, Union
from warnings import catch_warnings, showwarning, warn
import pkg_resources
@@ -721,40 +721,48 @@
warn(warn_str.format(filename, st_mode - stat.S_IFREG, mode))
-def compute_file_hash(filename: str, sha: str = 'sha1', bytes_to_read=None):
+def compute_file_hash(filename: Union[str, os.PathLike],
+ sha: Union[str, Callable[[], Any]] = 'sha1',
+ bytes_to_read: Optional[int] = None) -> str:
"""Compute file hash.
Result is expressed as hexdigest().
.. versionadded:: 3.0
+ .. versionchanged:: 8.2
+ *sha* may be also a hash constructor, or a callable that returns
+ a hash object.
+
:param filename: filename path
- :param sha: hashing function among the following in hashlib:
- md5(), sha1(), sha224(), sha256(), sha384(), and sha512()
- function name shall be passed as string, e.g. 'sha1'.
- :param bytes_to_read: only the first bytes_to_read will be considered;
- if file size is smaller, the whole file will be considered.
- :type bytes_to_read: None or int
-
+ :param sha: hash algorithm available with hashlib: ``sha1()``,
+ ``sha224()``, ``sha256()``, ``sha384()``, ``sha512()``,
+ ``blake2b()``, and ``blake2s()``. Additional algorithms like
+ ``md5()``, ``sha3_224()``, ``sha3_256()``, ``sha3_384()``,
+ ``sha3_512()``, ``shake_128()`` and ``shake_256()`` may also be
+ available. *sha* must either be a hash algorithm name as a str
+ like ``'sha1'`` (default), a hash constructor like
+ ``hashlib.sha1``, or a callable that returns a hash object like
+ ``lambda: hashlib.sha1()``.
+ :param bytes_to_read: only the first bytes_to_read will be
+ considered; if file size is smaller, the whole file will be
+ considered.
"""
- size = os.path.getsize(filename)
- if bytes_to_read is None:
- bytes_to_read = size
- else:
- bytes_to_read = min(bytes_to_read, size)
- step = 1 << 20
-
- shas = ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512']
- assert sha in shas
- sha = getattr(hashlib, sha)() # sha instance
-
with open(filename, 'rb') as f:
- while bytes_to_read > 0:
- read_bytes = f.read(min(bytes_to_read, step))
- assert read_bytes # make sure we actually read bytes
- bytes_to_read -= len(read_bytes)
- sha.update(read_bytes)
- return sha.hexdigest()
+ if PYTHON_VERSION < (3, 11) or bytes_to_read is not None:
+ digest = sha() if callable(sha) else hashlib.new(sha)
+ size = os.path.getsize(filename)
+ bytes_to_read = min(bytes_to_read or size, size)
+ step = 1 << 20
+ while bytes_to_read > 0:
+ read_bytes = f.read(min(bytes_to_read, step))
+ assert read_bytes # make sure we actually read bytes
+ bytes_to_read -= len(read_bytes)
+ digest.update(read_bytes)
+ else:
+ digest = hashlib.file_digest(f, sha)
+
+ return digest.hexdigest()
def cached(*arg: Callable) -> Any:
diff --git a/tests/tools_tests.py b/tests/tools_tests.py
index 2084d77..15471ac 100755
--- a/tests/tools_tests.py
+++ b/tests/tools_tests.py
@@ -5,6 +5,7 @@
#
# Distributed under the terms of the MIT license.
import decimal
+import hashlib
import os
import subprocess
import tempfile
@@ -12,6 +13,7 @@
from collections import Counter, OrderedDict
from collections.abc import Mapping
from contextlib import suppress
+from functools import partial
from unittest import mock
from pywikibot import config, tools
@@ -599,35 +601,49 @@
self.chmod.assert_called_once_with(self.file, 0o600)
+def hash_func(digest):
+ """Function who gives a hashlib function."""
+ return hashlib.new(digest)
+
+
class TestFileShaCalculator(TestCase):
r"""Test calculator of sha of a file.
There are two possible hash values for each test. The second one is for
files with Windows line endings (\r\n).
-
"""
net = False
filename = join_xml_data_path('article-pear-0.10.xml')
+ md5_tests = {
+ 'str': 'md5',
+ 'hash': hashlib.md5,
+ 'function': partial(hash_func, 'md5')
+ }
+
def test_md5_complete_calculation(self):
"""Test md5 of complete file."""
- res = tools.compute_file_hash(self.filename, sha='md5')
- self.assertIn(res, (
- '5d7265e290e6733e1e2020630262a6f3',
- '2c941f2fa7e6e629d165708eb02b67f7',
- ))
+ for test, sha in self.md5_tests.items():
+ with self.subTest(test=test):
+ res = tools.compute_file_hash(self.filename, sha=sha)
+ self.assertIn(res, (
+ '5d7265e290e6733e1e2020630262a6f3',
+ '2c941f2fa7e6e629d165708eb02b67f7',
+ ))
def test_md5_partial_calculation(self):
"""Test md5 of partial file (1024 bytes)."""
- res = tools.compute_file_hash(self.filename, sha='md5',
- bytes_to_read=1024)
- self.assertIn(res, (
- 'edf6e1accead082b6b831a0a600704bc',
- 'be0227b6d490baa49e6d7e131c7f596b',
- ))
+ for test, sha in self.md5_tests.items():
+ with self.subTest(test=test):
+ res = tools.compute_file_hash(self.filename, sha=sha,
+ bytes_to_read=1024)
+ self.assertIn(res, (
+ 'edf6e1accead082b6b831a0a600704bc',
+ 'be0227b6d490baa49e6d7e131c7f596b',
+ ))
def test_sha1_complete_calculation(self):
"""Test sha1 of complete file."""
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/924492
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I9d58150c67123e619f15c8c502aaaaf2abe78ed8
Gerrit-Change-Number: 924492
Gerrit-PatchSet: 5
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
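The version switch used in this change can be tried without Pywikibot. The sketch below is a simplified variant, not the merged function: it has no bytes_to_read parameter, checks sys.version_info instead of Pywikibot's PYTHON_VERSION, and file_hexdigest() is a hypothetical name.

```python
import hashlib
import sys


def file_hexdigest(filename, sha='sha1'):
    """Return the hex digest of a whole file (simplified sketch).

    sha is an algorithm name such as 'sha1', a hash constructor such
    as hashlib.sha1, or any callable returning a hash object.
    """
    with open(filename, 'rb') as f:
        if sys.version_info >= (3, 11):
            # Available since Python 3.11; accepts a name or a callable.
            digest = hashlib.file_digest(f, sha)
        else:
            digest = sha() if callable(sha) else hashlib.new(sha)
            # Read in 1 MiB chunks until read() returns b''.
            for chunk in iter(lambda: f.read(1 << 20), b''):
                digest.update(chunk)
    return digest.hexdigest()
```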
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926508 )
Change subject: [bugfix] add curly brackets to notAtEnd string in textlib.compileLinkR
......................................................................
[bugfix] add curly brackets to notAtEnd string in textlib.compileLinkR
Bug: T338029
Change-Id: I780215104224c0e048f065d776a1b81be79613b2
---
M pywikibot/textlib.py
1 file changed, 11 insertions(+), 1 deletion(-)
Approvals:
Framawiki: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index 3c73b34..72e60ca 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -1653,7 +1653,7 @@
# Note: While allowing dots inside URLs, MediaWiki will regard
# dots at the end of the URL as not part of that URL.
# The same applies to comma, colon and some other characters.
- notAtEnd = r'\]\s\.:;,<>"\|\)'
+ notAtEnd = r'\]\s\.:;,<>"\|\)}'
# So characters inside the URL can be anything except whitespace,
# closing squared brackets, quotation marks, greater than and less
# than, and the last character also can't be parenthesis or another
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/926508
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I780215104224c0e048f065d776a1b81be79613b2
Gerrit-Change-Number: 926508
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Framawiki <framawiki(a)tools.wmflabs.org>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
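The effect of adding the curly bracket to notAtEnd can be seen with a reduced pattern. The sketch below is an assumption-laden simplification, not the regex built by textlib.compileLinkR: it only combines a "not inside" class with the old and new "not at end" classes, and the names OLD_NOT_AT_END, NEW_NOT_AT_END and url_regex() are hypothetical.

```python
import re

NOT_INSIDE = r'\]\s<>"'
OLD_NOT_AT_END = r'\]\s\.:;,<>"\|\)'
NEW_NOT_AT_END = r'\]\s\.:;,<>"\|\)}'


def url_regex(not_at_end):
    """Compile a simplified URL pattern (not the real compileLinkR one)."""
    return re.compile(
        'https?://[^' + NOT_INSIDE + ']*[^' + not_at_end + ']')


text = '{{URL|https://example.org/page}}'
old_match = url_regex(OLD_NOT_AT_END).search(text).group()
new_match = url_regex(NEW_NOT_AT_END).search(text).group()
```

With the old class the closing template braces are captured as part of the URL; with the new class the match stops at the last character before them.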