jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1074562?usp=email )
Change subject: pywikibot.scripts: Remove preload_sites.py
pywikibot.scripts: Remove preload_sites.py
This script no longer works with the current login implementation.
Bug: T348925
Change-Id: I18ef387df276530b60eb9d07b81685b65fc64e75
---
M docs/faq.rst
M docs/utilities/scripts.rst
M docs/utilities/scripts_ref.rst
M pywikibot/CONTENT.rst
M pywikibot/scripts/__init__.py
D pywikibot/scripts/preload_sites.py
6 files changed, 17 insertions(+), 162 deletions(-)
Approvals:
  jenkins-bot: Verified
  Xqt: Looks good to me, approved
diff --git a/docs/faq.rst b/docs/faq.rst
index c422292..483909c 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -4,24 +4,21 @@
 **How to speed up Pywikibot?**
-  1. The first time you are using Pywikibot for multiple Wikimedia sites you
-     can run :py:mod:`preload_sites <pywikibot.scripts.preload_sites>` script
-     to preload site info quickly.
-  2. If you need the content, use :py:mod:`PreloadingGenerator
-     <pagegenerators.PreloadingGenerator>` with page generators,
-     :py:mod:`EntityGenerator <pagegenerators.EntityGenerator>`
-     for wikibase entities and :py:mod:`DequePreloadingGenerator
-     <pagegenerators.DequePreloadingGenerator>` for a
-     :py:mod:`DequeGenerator <tools.collections.DequeGenerator>`.
-  3. If you use :py:mod:`GeneratorFactory
-     <pagegenerators.GeneratorFactory>` with your bot and use its
-     :py:mod:`getCombinedGenerator
-     <pagegenerators.GeneratorFactory.getCombinedGenerator>` method
-     you can set ``preload=True`` to preload page content. This is an alternate
-     to the ``PreloadingGenerator`` function mentioned above.
-  4. Use :py:mod:`MySQLPageGenerator
-     <pagegenerators.MySQLPageGenerator >` if direct DB access is
-     available and appropriate. See also: :manpage:`MySQL`
+  * If you need the content, use :py:mod:`PreloadingGenerator
+    <pagegenerators.PreloadingGenerator>` with page generators,
+    :func:`PreloadingEntityGenerator <pagegenerators.PreloadingEntityGenerator>`
+    for wikibase entities and :py:mod:`DequePreloadingGenerator
+    <pagegenerators.DequePreloadingGenerator>` for a
+    :py:mod:`DequeGenerator <tools.collections.DequeGenerator>`.
+  * If you use :py:mod:`GeneratorFactory
+    <pagegenerators.GeneratorFactory>` with your bot and use its
+    :py:mod:`getCombinedGenerator
+    <pagegenerators.GeneratorFactory.getCombinedGenerator>` method
+    you can set ``preload=True`` to preload page content. This is an alternate
+    to the ``PreloadingGenerator`` function mentioned above.
+  * Use :func:`MySQLPageGenerator
+    <pagegenerators.MySQLPageGenerator>` if direct DB access is
+    available and appropriate. See also: :manpage:`MySQL`
 **The bot cannot delete pages**
   Your account needs delete rights on your wiki. If you have setup another
diff --git a/docs/utilities/scripts.rst b/docs/utilities/scripts.rst
index 474bca9..a55ee54 100644
--- a/docs/utilities/scripts.rst
+++ b/docs/utilities/scripts.rst
@@ -34,13 +34,6 @@
    :no-members:
    :noindex:
-preload_sites script
-====================
-
-.. automodule:: pywikibot.scripts.preload_sites
-   :no-members:
-   :noindex:
-
 shell script
 ============
diff --git a/docs/utilities/scripts_ref.rst b/docs/utilities/scripts_ref.rst
index d5acf52..1f796e5 100644
--- a/docs/utilities/scripts_ref.rst
+++ b/docs/utilities/scripts_ref.rst
@@ -30,12 +30,6 @@
 .. automodule:: pywikibot.scripts.login
    :synopsis: Script to log the bot in to a wiki account
-preload_sites script
-====================
-
-.. automodule:: pywikibot.scripts.preload_sites
-   :synopsis: Script that preloads site and user info for all sites of given family
-
 shell script
 ============
diff --git a/pywikibot/CONTENT.rst b/pywikibot/CONTENT.rst
index c1a131e..4f02a96 100644
--- a/pywikibot/CONTENT.rst
+++ b/pywikibot/CONTENT.rst
@@ -158,10 +158,6 @@
     +----------------------------+------------------------------------------------------+
     | login.py                   | Script to log the bot in to a wiki account.          |
     +----------------------------+------------------------------------------------------+
-    | preload_sites.py           | Preload and cache site information for each          |
-    |                            | WikiMedia family within seconds. Useful for bots     |
-    |                            | running on multiple sites.                           |
-    +----------------------------+------------------------------------------------------+
     | shell.py                   | Spawns an interactive Python shell with pywikibot    |
     |                            | imported                                             |
     +----------------------------+------------------------------------------------------+
diff --git a/pywikibot/scripts/__init__.py b/pywikibot/scripts/__init__.py
index fabe48b..73a1e55 100644
--- a/pywikibot/scripts/__init__.py
+++ b/pywikibot/scripts/__init__.py
@@ -1,6 +1,8 @@
 """Folder which holds framework scripts.
 .. versionadded:: 7.0
+.. versionremoved:: 9.4
+   ``preload_sites`` script was removed (:phab:`T348925`).
 """
 #
 # (C) Pywikibot team, 2021-2022
diff --git a/pywikibot/scripts/preload_sites.py b/pywikibot/scripts/preload_sites.py
deleted file mode 100755
index 6bd3b66..0000000
--- a/pywikibot/scripts/preload_sites.py
+++ /dev/null
@@ -1,127 +0,0 @@
-#!/usr/bin/env python3
-"""Script that preloads site and user info for all sites of given family.
-
-The following parameters are supported:
-
- -worker:<num>    The number of parallel tasks to be run. Default is the
-                  number of processors on the machine
-
-**Usage:**
-
-    python pwb.py preload_sites [{<family>}] [-worker:{<num>}]
-
-To force preloading, change the global expiry values to 0:
-
-    python pwb.py -API_config_expiry:0 -API_uinfo_expiry:0 \
-    preload_sites [{<family>}]
-
-or run the :mod:`cache<scripts.maintenance.cache>` script previeously:
-
-    python pwb.py cache -delete
-
-.. versionchanged:: 7.4
-   script was moved to the framework scripts folder.
-"""
-#
-# (C) Pywikibot team, 2021-2024
-#
-# Distributed under the terms of the MIT license.
-#
-from __future__ import annotations
-
-from concurrent.futures import ThreadPoolExecutor, wait
-from datetime import datetime
-
-import pywikibot
-from pywikibot.backports import removeprefix
-from pywikibot.family import Family
-
-
-try:  # Python 3.13
-    from os import process_cpu_count  # type: ignore[attr-defined]
-except ImportError:
-    from os import cpu_count as process_cpu_count
-
-
-#: supported families by this script
-families_list = [
-    'wikibooks',
-    'wikinews',
-    'wikipedia',
-    'wikiquote',
-    'wikisource',
-    'wikiversity',
-    'wikivoyage',
-    'wiktionary',
-]
-
-# Ignore sites from preloading
-# example: {'wikiversity': ['beta'], }
-exceptions: dict[str, list[str]] = {
-}
-
-
-def preload_family(family: str, executor: ThreadPoolExecutor) -> None:
-    """Preload all sites of a single family file.
-
-    .. versionchanged:: 9.2
-       use a separate worker thread for each site.
-    """
-
-    def create_page(code, family):
-        """Preload siteinfo and userinfo."""
-        site = pywikibot.Site(code, family)
-        pywikibot.Page(site, 'Main Page')
-
-    msg = 'Preloading sites of {} family{}'
-    pywikibot.info(msg.format(family, '...'))
-
-    codes = Family.load(family).codes
-    for code in exceptions.get(family, []):
-        if code in codes:
-            codes.remove(code)
-
-    obsolete = Family.load(family).obsolete
-
-    futures = set()
-    for code in codes:
-        if code not in obsolete:
-            futures.add(executor.submit(create_page, code, family))
-    wait(futures)
-    pywikibot.info(msg.format(family, ' completed.'))
-
-
-def preload_families(families: list[str] | set[str],
-                     worker: int | None) -> None:
-    """Preload all sites of all given family files.
-
-    .. versionchanged:: 7.3
-       Default of worker is calculated like for Python 3.8 but preserves
-       at least one worker for each element in families_list for better
-       performance.
-    """
-    start = datetime.now()
-    if worker is None:
-        # Python 3.13 default
-        worker = min(32, (process_cpu_count() or 1) + 4)
-    # to allow adding futures in preload_family the workers must be one
-    # more than families are handled
-    worker = max(len(families) * 2, worker)
-    pywikibot.info(
-        f'Using {worker} workers to process {len(families)} families')
-    with ThreadPoolExecutor(worker) as executor:
-        futures = {executor.submit(preload_family, family, executor)
-                   for family in families}
-        wait(futures)
-    pywikibot.info(f'Loading time used: {datetime.now() - start}')
-
-
-if __name__ == '__main__':
-    fam = set()
-    worker = None
-    for arg in pywikibot.handle_args():
-        if arg in families_list:
-            fam.add(arg)
-        elif arg.startswith('-worker:'):
-            worker = int(removeprefix(arg, '-worker:'))
-    preload_families(fam or families_list, worker)
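
Note (not part of the change): the removed script submitted one task per family and, from inside each of those tasks, one task per site to the same ThreadPoolExecutor. A minimal standard-library sketch of that nested fan-out and of its worker sizing follows. The names default_workers, load_site, load_family and load_all are hypothetical; load_site merely stands in for the pywikibot.Site / pywikibot.Page calls, and the sketch assumes the max() clamp is applied whether or not a worker count was given.

```python
from __future__ import annotations

import os
from concurrent.futures import ThreadPoolExecutor, wait


def default_workers(n_families: int, worker: int | None = None) -> int:
    """Size the pool the way the removed preload_families appears to."""
    if worker is None:
        # os.process_cpu_count exists only on Python 3.13+; fall back otherwise
        cpu_count = getattr(os, 'process_cpu_count', os.cpu_count)
        worker = min(32, (cpu_count() or 1) + 4)
    # Each family task occupies a thread while it waits for its site tasks,
    # so the pool must hold more threads than there are families.
    return max(n_families * 2, worker)


def load_site(family: str, code: str) -> str:
    """Hypothetical stand-in for the per-site work (no network access)."""
    return f'{family}:{code}'


def load_family(family: str, codes: list[str],
                executor: ThreadPoolExecutor) -> list[str]:
    """Submit one task per site to the shared executor and wait for all."""
    futures = {executor.submit(load_site, family, code) for code in codes}
    wait(futures)
    return sorted(future.result() for future in futures)


def load_all(families: dict[str, list[str]],
             worker: int | None = None) -> dict[str, list[str]]:
    """Run one outer task per family; each outer task fans out per site."""
    with ThreadPoolExecutor(default_workers(len(families), worker)) as executor:
        outer = {family: executor.submit(load_family, family, codes, executor)
                 for family, codes in families.items()}
        return {family: future.result() for family, future in outer.items()}
```

Without the clamp, a pool no larger than the number of families could deadlock: every thread would be held by a family task waiting on site tasks that can never be scheduled.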
pywikibot-commits@lists.wikimedia.org