jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1130307?usp=email )
Change subject: [FEAT] Support 'page' in GeneratorsMixin.recentchanges
......................................................................
[FEAT] Support 'page' in GeneratorsMixin.recentchanges
Translates to rctitle.
Also clean up 'user' and 'excludeuser'. They have probably
never supported multiple values server-side.
Change-Id: Ie8b50a4da69c307fadc75857a418f6f8cdfb6298
---
M pywikibot/site/_generators.py
1 file changed, 8 insertions(+), 4 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
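
As a rough illustration of the commit message above, here is a minimal sketch of how the keyword arguments could map onto API parameters. It is not Pywikibot's implementation: the helper `build_rc_params` is hypothetical, and treating any non-string object with a `title()` method as a page is an assumption standing in for `pywikibot.Page`. The names `rctitle`, `rcuser` and `rcexcludeuser` are parameters of the MediaWiki `list=recentchanges` API module.

```python
from __future__ import annotations

from typing import Any


def build_rc_params(page: Any = None,
                    user: str | None = None,
                    excludeuser: str | None = None) -> dict[str, str]:
    """Hypothetical helper: map keyword arguments to API parameters."""
    params = {'list': 'recentchanges'}
    if page:
        # Check for str first: str also has a title() method, which
        # would change the capitalisation of the page name.
        # Assumption: any other object exposes title() like a Page does.
        params['rctitle'] = page if isinstance(page, str) else page.title()
    if user:
        # A single user name only, matching the narrowed type hint.
        params['rcuser'] = user
    if excludeuser:
        params['rcexcludeuser'] = excludeuser
    return params
```

With the merged change, the real-world counterpart would be a call such as `site.recentchanges(page='Main Page', total=5)`.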
diff --git a/pywikibot/site/_generators.py b/pywikibot/site/_generators.py
index 92a2229..2f550aa 100644
--- a/pywikibot/site/_generators.py
+++ b/pywikibot/site/_generators.py
@@ -1469,9 +1469,10 @@
redirect: bool | None = None,
patrolled: bool | None = None,
top_only: bool = False,
+ page: str | pywikibot.Page | None = None,
total: int | None = None,
- user: str | list[str] | None = None,
- excludeuser: str | list[str] | None = None,
+ user: str | None = None,
+ excludeuser: str | None = None,
tag: str | None = None,
) -> Iterable[dict[str, Any]]:
"""Iterate recent changes.
@@ -1499,8 +1500,9 @@
only list non-patrolled edits; if None, list all
:param top_only: if True, only list changes that are the latest
revision (default False)
- :param user: if not None, only list edits by this user or users
- :param excludeuser: if not None, exclude edits by this user or users
+ :param page: if not None, only list edits to this page
+ :param user: if not None, only list edits by this user
+ :param excludeuser: if not None, exclude edits by this user
:param tag: a recent changes tag
:raises KeyError: a namespace identifier was not resolved
:raises TypeError: a namespace identifier has an inappropriate
@@ -1522,6 +1524,8 @@
rcgen.request['rcdir'] = 'newer'
if changetype:
rcgen.request['rctype'] = changetype
+ if page:
+ rcgen.request['rctitle'] = page
filters = {'minor': minor,
'bot': bot,
'anon': anon,
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1130307?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings?usp=email
Gerrit-MessageType: merged
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ie8b50a4da69c307fadc75857a418f6f8cdfb6298
Gerrit-Change-Number: 1130307
Gerrit-PatchSet: 1
Gerrit-Owner: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1123799?usp=email )
Change subject: [bugfix] use googlesearch-python for GoogleSearchPageGenerator
......................................................................
[bugfix] use googlesearch-python for GoogleSearchPageGenerator
The google package has been unsupported for more than 4 years, and its
newest version fails because the module name was changed.
GoogleSearchPageGenerator has also been failing for years because the
protocol changed from http to https.
- use googlesearch-python package
- add set_maximum_items method to GoogleSearchPageGenerator to be used
with pagegenerators.GeneratorFactory
- total argument was added; the default value is 10
- queryGoogle method got **kwargs to pass additional arguments to the
googlesearch-python package
Bug: T387618
Change-Id: Ic3b594dc7b1f8510691274202050869abe915100
---
M pywikibot/pagegenerators/_generators.py
M requirements.txt
M setup.py
3 files changed, 93 insertions(+), 48 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
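
The interplay of the *total* argument, `set_maximum_items` and the keyword pass-through listed in the commit message above can be sketched as follows. This is a simplified stand-in and not the class from the change: `LimitedSearch`, `fake_search` and the injected `backend` argument are hypothetical, and only the `num_results` keyword is taken from the change itself.

```python
from __future__ import annotations

from collections.abc import Callable, Iterator


class LimitedSearch:
    """Hypothetical stand-in for a search-backed generator wrapper."""

    def __init__(self, query: str = '', /, *, total: int = 10,
                 backend: Callable[..., Iterator[str]]) -> None:
        # query is positional only; all other arguments are keyword only.
        self.query = query
        self.limit = total
        # Assumption: the backend is injected so the sketch runs
        # without the googlesearch-python package or network access.
        self._backend = backend

    def set_maximum_items(self, value: int, /) -> None:
        """Let a caller adjust the limit after construction."""
        self.limit = value

    @property
    def generator(self) -> Iterator[str]:
        # The limit is handed to the backend instead of slicing the
        # results afterwards.
        yield from self._backend(self.query, num_results=self.limit)


def fake_search(query: str, *, num_results: int = 10) -> Iterator[str]:
    """Hypothetical backend yielding num_results dummy results."""
    for i in range(num_results):
        yield f'{query}-{i}'
```

Handing the limit to the backend means no more results are requested than the caller asked for, which is presumably why the factory gets a dedicated setter rather than truncating the output.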
diff --git a/pywikibot/pagegenerators/_generators.py b/pywikibot/pagegenerators/_generators.py
index 4200291..2cf3c0e 100644
--- a/pywikibot/pagegenerators/_generators.py
+++ b/pywikibot/pagegenerators/_generators.py
@@ -10,7 +10,6 @@
import codecs
import io
import re
-import sys
import typing
from collections import abc
from functools import partial
@@ -872,85 +871,131 @@
yield page
-# following classes just ported from version 1 without revision; not tested
-
-
class GoogleSearchPageGenerator(GeneratorWrapper):
"""Page generator using Google search results.
- To use this generator, you need to install the package 'google':
+ To use this generator, you need to install the googlesearch package::
- :py:obj:`https://pypi.org/project/google`
-
- This package has been available since 2010, hosted on GitHub
- since 2012, and provided by PyPI since 2013.
+ pip install googlesearch-python
As there are concerns about Google's Terms of Service, this
generator prints a warning for each query.
+ .. seealso:: https://policies.google.com/terms
.. versionchanged:: 7.6
subclassed from :class:`tools.collections.GeneratorWrapper`
+ .. versionchanged:: 10.1
+ ``googlesearch-python`` package is needed instead of ``google``,
+ see :phab:`T387618` for further information. The *total*
+ parameter was added. The *query* parameter is positional only.
+ All other parameters are keyword only.
"""
- def __init__(self, query: str | None = None,
- site: BaseSite | None = None) -> None:
+ def __init__(self, query: str = '', /, *,
+ site: BaseSite | None = None,
+ total: int = 10) -> None:
"""Initializer.
+ :param query: the text to search for.
:param site: Site for generator results.
+ :param total: the maximum number of results to return; the
+ default is 10, which is also the googlesearch package default.
"""
- self.query = query or pywikibot.input('Please enter the search query:')
- if site is None:
- site = pywikibot.Site()
- self.site = site
- self._google_query = None
+ self.query = query or pywikibot.input(
+ 'Please enter the search query:')
+ self.site = site or pywikibot.Site()
+ self.limit = total
@staticmethod
- def queryGoogle(query: str) -> Generator[str, None, None]:
- """Perform a query using python package 'google'.
+ def queryGoogle(query: str, /, **kwargs) -> Generator[str, None, None]:
+ """Perform a query using ``googlesearch-python`` package.
- The terms of service as at June 2014 give two conditions that
- may apply to use of search:
+ .. admonition:: Terms of Service
- 1. Don't access [Google Services] using a method other than
- the interface and the instructions that [they] provide.
- 2. Don't remove, obscure, or alter any legal notices
- displayed in or along with [Google] Services.
+ The terms of service as at June 2014 give two conditions that
+ may apply to use of search:
- Both of those issues should be managed by the package 'google',
- however Pywikibot will at least ensure the user sees the TOS
- in order to comply with the second condition.
+ 1. Don't access [Google Services] using a method other than
+ the interface and the instructions that [they] provide.
+ 2. Don't remove, obscure, or alter any legal notices
+ displayed in or along with [Google] Services.
+
+ Both of those issues should be managed by the
+ ``googlesearch-python`` package, however Pywikibot will at
+ least ensure the user sees the TOS in order to comply with
+ the second condition.
+
+ .. seealso:: https://policies.google.com/terms
+
+ .. important:: These notes are from 2014 and have not been
+ reviewed or updated since then.
+
+ .. versionchanged:: 10.1
+ *query* is positional only; *kwargs* parameter was added.
+
+ :param query: the text to search for.
+ :param kwargs: other keyword arguments passed to ``googlesearch``
+ module.
"""
try:
- import google
- except ImportError:
- pywikibot.error('generator GoogleSearchPageGenerator '
- "depends on package 'google'.\n"
- 'To install, please run: pip install google.')
- sys.exit(1)
+ import googlesearch
+ except ModuleNotFoundError:
+ pywikibot.error("""\
+generator GoogleSearchPageGenerator depends on package
+'googlesearch-python'. To install, please run:
+
+ pip install googlesearch-python""")
+ return
+
pywikibot.warning('Please read http://www.google.com/accounts/TOS')
- yield from google.search(query)
+ yield from googlesearch.search(query, **kwargs)
@property
def generator(self) -> Generator[pywikibot.page.Page, None, None]:
"""Yield results from :meth:`queryGoogle` query.
- Google contains links in the format:
- https://de.wikipedia.org/wiki/en:Foobar
-
.. versionchanged:: 7.6
changed from iterator method to generator property
+ .. versionchanged:: 10.1
+ use :meth:`site.protocol
+ <pywikibot.site._basesite.BaseSite.protocol>` to get the base
+ URL. Also filter duplicates.
"""
+ if not self.query:
+ pywikibot.warning('No query string was specified')
+ return
+
# restrict query to local site
- local_query = f'{self.query} site:{self.site.hostname()}'
- base = f'http://{self.site.hostname()}{self.site.articlepath}'
- pattern = base.replace('{}', '(.+)')
- for url in self.queryGoogle(local_query):
- m = re.search(pattern, url)
- if m:
- page = pywikibot.Page(pywikibot.Link(m[1], self.site))
- if page.site == self.site:
- yield page
+ site = self.site
+ local_query = f'{self.query} site:{site.hostname()}'
+ base = f'{site.protocol()}://{site.hostname()}{site.articlepath}'
+ pattern = re.compile(base.replace('{}', '(?P<title>.+)'))
+
+ for url in self.queryGoogle(local_query, num_results=self.limit,
+ unique=True):
+ m = pattern.fullmatch(url)
+ if not m:
+ continue
+
+ page = pywikibot.Page(pywikibot.Link(m['title'], site))
+
+ # Google may contain links in the format:
+ # https://de.wikipedia.org/wiki/en:Foobar
+ if page.site == site:
+ yield page
+
+ def set_maximum_items(self, value: int, /):
+ """Set the maximum number of items to be retrieved from Google.
+
+ This method exists so that
+ :class:`pagegenerators.GeneratorFactory` can bypass its
+ :func:`itertools.islice` filter for this generator.
+
+ .. versionadded:: 10.1
+ """
+ self.limit = value
def MySQLPageGenerator(query: str, site: BaseSite | None = None,
diff --git a/requirements.txt b/requirements.txt
index b096bdd..c6f3ef3 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -46,7 +46,7 @@
Pillow==10.4.0; python_version < "3.9"
# core pagegenerators
-google >= 1.7
+googlesearch-python >= 1.3.0
requests-sse >= 0.5.0
# The mysql generator in pagegenerators depends on PyMySQL
diff --git a/setup.py b/setup.py
index 4e40ecb..f25a3d1 100755
--- a/setup.py
+++ b/setup.py
@@ -40,7 +40,7 @@
'eventstreams': ['requests-sse>=0.5.0'],
'isbn': ['python-stdnum>=1.20'],
'Graphviz': ['pydot>=3.0.2'],
- 'Google': ['google>=1.7'],
+ 'Google': ['googlesearch-python >= 1.3.0'],
'memento': ['memento_client==0.6.1'],
'wikitextparser': ['wikitextparser>=0.56.3'],
'mysql': ['PyMySQL >= 1.1.1'],
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1123799?usp=email
Gerrit-MessageType: merged
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ic3b594dc7b1f8510691274202050869abe915100
Gerrit-Change-Number: 1123799
Gerrit-PatchSet: 4
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
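
The URL filtering in the `generator` property of the change above can be tried in isolation with a small sketch. The functions `title_pattern` and `extract_title` are hypothetical, and the values `https`, `de.wikipedia.org` and `/wiki/{}` are assumptions chosen to match the example URL in the diff; in the change they come from `site.protocol()`, `site.hostname()` and `site.articlepath`.

```python
from __future__ import annotations

import re


def title_pattern(protocol: str, hostname: str,
                  articlepath: str) -> re.Pattern[str]:
    """Build a pattern with a named group where the title goes."""
    base = f'{protocol}://{hostname}{articlepath}'
    # As in the diff, the base URL is not passed through re.escape,
    # so each '.' in the host name matches any character.
    return re.compile(base.replace('{}', '(?P<title>.+)'))


# Assumed example values, see the lead-in.
PATTERN = title_pattern('https', 'de.wikipedia.org', '/wiki/{}')


def extract_title(url: str) -> str | None:
    """Return the title part of url, or None if url does not match."""
    # fullmatch requires the whole URL to fit the pattern, so a URL
    # with a different protocol or host is rejected.
    m = PATTERN.fullmatch(url)
    return m['title'] if m else None
```

A title such as `en:Foobar` still passes this step. In the change it is rejected afterwards by comparing `page.site` with the requested site.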
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1125636?usp=email )
Change subject: [tests] Update pre-commit hooks
......................................................................
[tests] Update pre-commit hooks
Change-Id: I827fe09160d3fac8f6775a8d661d370032886748
---
M .pre-commit-config.yaml
M pywikibot/pagegenerators/_generators.py
2 files changed, 3 insertions(+), 3 deletions(-)
Approvals:
Xqt: Verified; Looks good to me, approved
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 8e264d5..687971b 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -58,7 +58,7 @@
- id: rst-inline-touching-normal
- id: text-unicode-replacement-char
- repo: https://github.com/astral-sh/ruff-pre-commit
- rev: v0.9.9
+ rev: v0.9.10
hooks:
- id: ruff
args:
@@ -86,7 +86,7 @@
- id: isort
exclude: '^pwb\.py$'
- repo: https://github.com/jshwi/docsig
- rev: v0.69.1
+ rev: v0.69.3
hooks:
- id: docsig
exclude: ^(tests|scripts)
diff --git a/pywikibot/pagegenerators/_generators.py b/pywikibot/pagegenerators/_generators.py
index e07544f..4200291 100644
--- a/pywikibot/pagegenerators/_generators.py
+++ b/pywikibot/pagegenerators/_generators.py
@@ -830,7 +830,7 @@
:param total: Maximum number of pages to retrieve in total
:param namespaces: search only in these namespaces (defaults to all)
:param site: Site for generator results.
- :keyword str \| None where: Where to search; value must be one of the
+ :keyword str | None where: Where to search; value must be one of the
given literals or None (many wikis do not support all search
types)
:keyword bool content: if True, load the current content of each
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1125636?usp=email
Gerrit-MessageType: merged
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I827fe09160d3fac8f6775a8d661d370032886748
Gerrit-Change-Number: 1125636
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Xqt has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1123786?usp=email )
Change subject: Cleanup: backport module is private, no deprecations here
......................................................................
Cleanup: backport module is private, no deprecations here
Change-Id: Ib6ac470c4a2e63aa0d64b4515d4bdd4df869403f
---
M ROADMAP.rst
1 file changed, 1 insertion(+), 1 deletion(-)
Approvals:
Xqt: Verified; Looks good to me, approved
jenkins-bot: Verified
diff --git a/ROADMAP.rst b/ROADMAP.rst
index ba96169..2769b09 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -37,7 +37,7 @@
* 9.0.0: ``iteritems`` method of :class:`data.api.Request` will be removed in favour of ``items``
* 9.0.0: ``SequenceOutputter.output()`` is deprecated in favour of :attr:`tools.formatter.SequenceOutputter.out`
property
-* 9.0.0: *nullcontext* context manager and *SimpleQueue* queue of :mod:`backports` are deprecated
+
Pending removal in Pywikibot 11
-------------------------------
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1123786?usp=email
Gerrit-MessageType: merged
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ib6ac470c4a2e63aa0d64b4515d4bdd4df869403f
Gerrit-Change-Number: 1123786
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot