jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1022548?usp=email )
Change subject: [fix] Increase read timeout for alllinks tests.
......................................................................
[fix] Increase read timeout for alllinks tests.
Using a *namespace* option different from ``0`` takes a lot of time
on the Wikidata site. Increase the read timeout to 60s for the tests
and add an important note to the documentation.
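For illustration, here is a minimal sketch of the save-and-restore
pattern this change applies in the tests. The context-manager wrapper
is not part of the patch; it only assumes, as the patch does, that
``config.socket_timeout`` is a ``(connect, read)`` tuple:

.. code:: python

    from contextlib import contextmanager

    from pywikibot import config


    @contextmanager
    def read_timeout(seconds):
        """Temporarily raise the read part of config.socket_timeout."""
        saved = config.socket_timeout  # save the timeout config
        config.socket_timeout = (saved[0], seconds)
        try:
            yield
        finally:
            config.socket_timeout = saved  # restore the timeout config

    # usage, e.g. for a slow alllinks() call on Wikidata:
    # with read_timeout(60):
    #     for page in site.alllinks(namespace=1, total=5):
    #         ...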
Bug: T359427
Change-Id: I0b6c56883612fb165b9b792bcc00323d0a25b41a
---
M pywikibot/site/_generators.py
M tests/site_generators_tests.py
2 files changed, 29 insertions(+), 2 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/site/_generators.py b/pywikibot/site/_generators.py
index eaed600..4b48de3 100644
--- a/pywikibot/site/_generators.py
+++ b/pywikibot/site/_generators.py
@@ -428,7 +428,9 @@
) -> Generator[pywikibot.Page, None, None]:
"""Yield internal wikilinks contained (or transcluded) on page.
- .. seealso:: :api:`Links`
+ .. seealso::
+ - :api:`Links`
+ - :meth:`page.BasePage.linkedPages`
:param namespaces: Only iterate pages in these namespaces
(default: all)
@@ -993,7 +995,27 @@
:func:`tools.itertools.filter_unique` in that case which
might be memory intensive. Use it with care.
- .. seealso:: :api:`Alllinks`
+ .. important:: Using a *namespace* option different from ``0``
+ takes a lot of time on the Wikidata site. You have to increase
+ the **read** timeout part of ``socket_timeout`` in
+ :ref:`Http Settings` in your ``user-config.py`` file, or
+ increase it partially within your code like this:
+
+ .. code:: python
+
+ from pywikibot import config
+ save_timeout = config.socket_timeout # save the timeout config
+ config.socket_timeout = save_timeout[0], 60
+ ... # your code here
+ config.socket_timeout = save_timeout # restore timeout config
+
+ The minimum read timeout value should be 60 seconds in that
+ case.
+
+ .. seealso::
+ - :api:`Alllinks`
+ - :meth:`pagebacklinks`
+ - :meth:`pagelinks`
:param start: Start at this title (page need not exist).
:param prefix: Only yield pages starting with this string.
diff --git a/tests/site_generators_tests.py b/tests/site_generators_tests.py
index 3fae8e7..189943c 100755
--- a/tests/site_generators_tests.py
+++ b/tests/site_generators_tests.py
@@ -363,10 +363,15 @@
msg=f"{page.title()} does not start with 'Fix'"
)
+ # increase timeout due to T359427/T359425
+ # ~47s are required on Wikidata
+ config_timeout = pywikibot.config.socket_timeout
+ pywikibot.config.socket_timeout = (config_timeout[0], 60)
with self.subTest(msg='Test namespace parameter'):
for page in mysite.alllinks(namespace=1, total=5):
self.assertIsInstance(page, pywikibot.Page)
self.assertEqual(page.namespace(), 1)
+ pywikibot.config.socket_timeout = config_timeout
with self.subTest(msg='Test with fromids parameter'):
for page in mysite.alllinks(start='From', namespace=4,
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1022548?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I0b6c56883612fb165b9b792bcc00323d0a25b41a
Gerrit-Change-Number: 1022548
Gerrit-PatchSet: 1
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1019421?usp=email )
Change subject: [bugfix] Fix getversion_nightly() function and deprecate svn support
......................................................................
[bugfix] Fix getversion_nightly() function and deprecate svn support
Get the nightly dump version for the compat release. The version file
was moved from the main folder to the pywikibot folder in the core
branch. See:
https://github.com/pywikibot/Pywikibot-nightly-creator/blame/628c197c389bf2…
Deprecate svn support because GitHub has dropped it and no svn
checkout can be done for Pywikibot any longer. The older locations on
wikimedia.org were abandoned years ago and redirect to Phabricator.
- remove .svnprops property file
- deprecate version.svn_rev_info() function and remove support for svn
older than 1.7
- remove version.github_svn_rev2hash() function which is no longer
functional because svn support was dropped on GitHub
- deprecate version.getversion_svn() function and remove hsh check
- fix version.getversion_nightly() to use the right path for version
file
- update ROADMAP.rst
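As a quick sanity check after this change, the detected version info
can be inspected with ``version.getversiondict()``; the returned keys
are the ones visible in the diff below (a sketch, not part of the
patch):

.. code:: python

    from pywikibot import version

    # getversiondict() falls back through git, nightly and package
    # detection and returns tag, revision, date and hash.
    info = version.getversiondict()
    print(info['tag'], info['rev'], info['date'], info['hsh'])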
Bug: T362492
Bug: T362484
Change-Id: I8b859bab98d1ea95c4c1e9996ef827b42c1d3a61
---
D .svnprops
M ROADMAP.rst
M pywikibot/version.py
3 files changed, 42 insertions(+), 71 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/.svnprops b/.svnprops
deleted file mode 100644
index 3fed37c..0000000
--- a/.svnprops
+++ /dev/null
Binary files differ
diff --git a/ROADMAP.rst b/ROADMAP.rst
index e0cecfc..1e7751a 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -1,6 +1,9 @@
Current release
---------------
+* Detect nightly version file with :func:`version.getversion_nightly` (:phab:`T362492`)
+* :mod:`version`.github_svn_rev2hash() was removed; it was no longer functional (:phab:`T362484`)
+* SVN support has been dropped; the ``.svnprops`` property settings file was removed (:phab:`T362484`)
* Skip process that requires login to logout (:phab:`T326614`)
* File title of :class:`specialbots.UploadRobot` must have a valid file extension (:phab:`T345786`)
* Add a :attr:`post_processor<specialbots.UploadRobot.post_processor>` attribute to :class:`specialbots.UploadRobot`
@@ -64,6 +67,8 @@
Will be removed in Pywikibot 10
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* 9.1.0: :func:`version.svn_rev_info` and :func:`version.getversion_svn` will be removed. SVN is no longer supported.
+ (:phab:`T362484`)
* 7.7.0: :mod:`tools.threading` classes should no longer imported from :mod:`tools`
* 7.6.0: :mod:`tools.itertools` datatypes should no longer imported from :mod:`tools`
* 7.6.0: :mod:`tools.collections` datatypes should no longer imported from :mod:`tools`
diff --git a/pywikibot/version.py b/pywikibot/version.py
index ae872d9..542c480 100644
--- a/pywikibot/version.py
+++ b/pywikibot/version.py
@@ -1,6 +1,6 @@
"""Module to determine the pywikibot version (tag, revision and date)."""
#
-# (C) Pywikibot team, 2007-2022
+# (C) Pywikibot team, 2007-2024
#
# Distributed under the terms of the MIT license.
#
@@ -11,14 +11,14 @@
import os
import pathlib
import socket
+import sqlite3
import subprocess
import sys
import sysconfig
import time
-import xml.dom.minidom
from contextlib import closing, suppress
from importlib import import_module
-from io import BytesIO
+from pathlib import Path
from warnings import warn
import pywikibot
@@ -26,7 +26,8 @@
from pywikibot.backports import cache
from pywikibot.comms.http import fetch
from pywikibot.exceptions import VersionParseError
-from pywikibot.tools import deprecated
+from pywikibot.tools import deprecated, suppress_warnings
+from pywikibot.tools._deprecate import _NotImplementedWarning
def _get_program_dir() -> str:
@@ -67,7 +68,7 @@
data['cmp_ver'] = 'UNKNOWN'
else:
for branch, path in branches.items():
- with suppress(Exception):
+ with suppress(VersionParseError):
hsh[getversion_onlinerepo(path)] = branch
if hsh:
data['cmp_ver'] = hsh.get(local_hsh, 'OUTDATED')
@@ -94,9 +95,12 @@
getversion_nightly,
getversion_package):
try:
- (tag, rev, date, hsh) = vcs_func(_program_dir)
+ with suppress_warnings(
+ f'.*({vcs_func.__name__}|svn_rev_info) is deprecated since '
+ 'release 9.1.', _NotImplementedWarning):
+ tag, rev, date, hsh = vcs_func(_program_dir)
except Exception as e:
- exceptions[vcs_func] = e
+ exceptions[vcs_func] = vcs_func.__name__, e
else:
break
else:
@@ -105,8 +109,8 @@
# pywikibot was imported without using version control at all.
tag, rev, date, hsh = (
'', '-1 (unknown)', '0 (unknown)', '(unknown)')
- warn('Unable to detect version; exceptions raised:\n{!r}'
- .format(exceptions), UserWarning)
+ warn(f'Unable to detect version; exceptions raised:\n{exceptions!r}',
+ UserWarning)
exceptions = None
# Git and SVN can silently fail, as it may be a nightly.
@@ -124,9 +128,15 @@
return {'tag': tag, 'rev': rev, 'date': datestring, 'hsh': hsh}
-def svn_rev_info(path): # pragma: no cover
+@deprecated(since='9.1')
+def svn_rev_info(path):
"""Fetch information about the current revision of a Subversion checkout.
+ .. deprecated:: 9.1
+ update to git repository.
+ .. versionchanged:: 9.1
+ drop support for svn 1.6 and older.
+
:param path: directory of the Subversion checkout
:return:
- tag (name for the repository),
@@ -137,31 +147,8 @@
if not os.path.isdir(os.path.join(path, '.svn')):
path = os.path.join(path, '..')
- _program_dir = path
- filename = os.path.join(_program_dir, '.svn/entries')
- if os.path.isfile(filename):
- with open(filename) as entries:
- version = entries.readline().strip()
- if version != '12':
- for _ in range(3):
- entries.readline()
- tag = entries.readline().strip()
- t = tag.split('://', 1)
- t[1] = t[1].replace('svn.wikimedia.org/svnroot/pywikipedia/',
- '')
- tag = '[{}] {}'.format(*t)
- for _ in range(4):
- entries.readline()
- date = time.strptime(entries.readline()[:19],
- '%Y-%m-%dT%H:%M:%S')
- rev = entries.readline()[:-1]
- return tag, rev, date
-
- # We haven't found the information in entries file.
- # Use sqlite table for new entries format
- from sqlite3 import dbapi2 as sqlite
with closing(
- sqlite.connect(os.path.join(_program_dir, '.svn/wc.db'))) as con:
+ sqlite3.connect(os.path.join(path, '.svn/wc.db'))) as con:
cur = con.cursor()
cur.execute("""select
local_relpath, repos_path, revision, changed_date, checksum from nodes
@@ -175,52 +162,28 @@
return tag, rev, date
-def github_svn_rev2hash(tag: str, rev): # pragma: no cover
- """Convert a Subversion revision to a Git hash using GitHub.
-
- :param tag: name of the Subversion repo on GitHub
- :param rev: Subversion revision identifier
- :return: the git hash
- """
- uri = f'https://github.com/wikimedia/{tag}/!svn/vcc/default'
- request = fetch(uri, method='PROPFIND',
- data="<?xml version='1.0' encoding='utf-8'?>"
- '<propfind xmlns=\"DAV:\"><allprop/></propfind>',
- headers={'label': str(rev),
- 'user-agent': 'SVN/1.7.5 {pwb}'})
- dom = xml.dom.minidom.parse(BytesIO(request.content))
- hsh = dom.getElementsByTagName('C:git-commit')[0].firstChild.nodeValue
- date = dom.getElementsByTagName('S:date')[0].firstChild.nodeValue
- date = time.strptime(date[:19], '%Y-%m-%dT%H:%M:%S')
- return hsh, date
-
-
-def getversion_svn(path=None): # pragma: no cover
+@deprecated(since='9.1')
+def getversion_svn(path=None):
"""Get version info for a Subversion checkout.
+ .. deprecated:: 9.1
+ update to git repository.
+
:param path: directory of the Subversion checkout
:return:
- tag (name for the repository),
- rev (current Subversion revision identifier),
- date (date of current revision),
- - hash (git hash for the Subversion revision)
+ - hash '(unknown)'
:rtype: ``tuple`` of three ``str`` and a ``time.struct_time``
"""
_program_dir = path or _get_program_dir()
tag, rev, date = svn_rev_info(_program_dir)
- hsh, date2 = github_svn_rev2hash(tag, rev)
- if date.tm_isdst >= 0 and date2.tm_isdst >= 0:
- assert date == date2, 'Date of version is not consistent'
- # date.tm_isdst is -1 means unknown state
- # compare its contents except daylight saving time status
- else:
- for i in range(len(date) - 1):
- assert date[i] == date2[i], 'Date of version is not consistent'
-
rev = f's{rev}'
+
if (not date or not tag or not rev) and not path:
raise VersionParseError
- return (tag, rev, date, hsh)
+ return (tag, rev, date, '(unknown)')
def getversion_git(path=None):
@@ -278,9 +241,13 @@
return (tag, rev, date, hsh)
-def getversion_nightly(path=None): # pragma: no cover
+def getversion_nightly(path: str | Path | None = None): # pragma: no cover
"""Get version info for a nightly release.
+ .. hint::
+ the version information of the nightly dump is stored in the
+ ``version`` file within the ``pywikibot`` folder.
+
:param path: directory of the uncompressed nightly.
:return:
- tag (name for the repository),
@@ -289,10 +256,9 @@
- hash (git hash for the current revision)
:rtype: ``tuple`` of three ``str`` and a ``time.struct_time``
"""
- if not path:
- path = _get_program_dir()
+ file = Path(path or _get_program_dir()) / 'pywikibot' / 'version'
- with open(os.path.join(path, 'version')) as data:
+ with file.open() as data:
(tag, rev, date, hsh) = data.readlines()
date = time.strptime(date[:19], '%Y-%m-%dT%H:%M:%S')
@@ -352,7 +318,7 @@
return None
program_dir = _get_program_dir()
- if filename[:len(program_dir)] == program_dir:
+ if filename.startswith(program_dir):
return filename
return None
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1019421?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I8b859bab98d1ea95c4c1e9996ef827b42c1d3a61
Gerrit-Change-Number: 1019421
Gerrit-PatchSet: 8
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1022005?usp=email )
Change subject: [fix] use filter_unique() in Site.alllinks() for MW >= 1.43
......................................................................
[fix] use filter_unique() in Site.alllinks() for MW >= 1.43
The *unique* parameter is currently not supported with MW 1.43 and it
might be dropped in miser mode. Therefore use filter_unique() to
ensure that unique pages are returned.
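A minimal sketch of the deduplication key used here; the dict items
are illustrative stand-ins for the link records the API generator
yields:

.. code:: python

    from pywikibot.tools.itertools import filter_unique

    links = [
        {'title': 'Talk:Foo', 'ns': 1},
        {'title': 'Foo', 'ns': 0},
        {'title': 'Talk:Foo', 'ns': 1},  # duplicate (title, ns) pair
    ]
    # the same (title, ns) key as used in Site.alllinks()
    unique = list(filter_unique(links, key=lambda x: (x['title'], x['ns'])))
    assert len(unique) == 2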
Bug: T359427
Change-Id: I16b7bd439dccfcc67b814e913955dea02a2700b4
---
M pywikibot/site/_generators.py
M tests/site_generators_tests.py
2 files changed, 26 insertions(+), 14 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/site/_generators.py b/pywikibot/site/_generators.py
index 457ae3c..a1be9e6 100644
--- a/pywikibot/site/_generators.py
+++ b/pywikibot/site/_generators.py
@@ -984,33 +984,47 @@
) -> Generator[pywikibot.Page, None, None]:
"""Iterate all links to pages (which need not exist) in one namespace.
- Note that, in practice, links that were found on pages that have
- been deleted may not have been removed from the links table, so this
- method can return false positives.
+ .. note:: In practice, links that were found on pages that have
+ been deleted may not have been removed from the links table,
+ so this method can return false positives.
+
+ .. caution:: The *unique* parameter is no longer supported by
+ MediaWiki 1.43 or higher. Pywikibot uses
+ :func:`tools.itertools.filter_unique` in that case which
+ might be memory intensive. Use it with care.
.. seealso:: :api:`Alllinks`
:param start: Start at this title (page need not exist).
:param prefix: Only yield pages starting with this string.
:param namespace: Iterate pages from this (single) namespace
- :param unique: If True, only iterate each link title once (default:
- iterate once for each linking page)
- :param fromids: if True, include the pageid of the page containing
- each link (default: False) as the '_fromid' attribute of the Page;
- cannot be combined with unique
- :raises KeyError: the namespace identifier was not resolved
- :raises TypeError: the namespace identifier has an inappropriate
- type such as bool, or an iterable with more than one namespace
+ :param unique: If True, only iterate each link title once
+ (default: False)
+ :param fromids: if True, include the pageid of the page
+ containing each link (default: False) as the '_fromid'
+ attribute of the Page; cannot be combined with *unique*
+ :raises KeyError: the *namespace* identifier was not resolved
+ :raises TypeError: the *namespace* identifier has an
+ inappropriate type such as bool, or an iterable with more
+ than one namespace
"""
if unique and fromids:
raise Error('alllinks: unique and fromids cannot both be True.')
algen = self._generator(api.ListGenerator, type_arg='alllinks',
namespaces=namespace, alfrom=start,
- total=total, alunique=unique)
+ total=total)
if prefix:
algen.request['alprefix'] = prefix
if fromids:
algen.request['alprop'] = 'title|ids'
+ if not unique:
+ pass
+ elif self.mw_version < '1.43':
+ algen.request['alunique'] = True
+ else:
+ # unique filter for mw >= 1.43, use (title, ns) as key
+ # See: T359425, T359427
+ algen = filter_unique(algen, key=lambda x: (x['title'], x['ns']))
for link in algen:
p = pywikibot.Page(self, link['title'], link['ns'])
if fromids:
diff --git a/tests/site_generators_tests.py b/tests/site_generators_tests.py
index 87f97ba..bc56f4a 100755
--- a/tests/site_generators_tests.py
+++ b/tests/site_generators_tests.py
@@ -338,8 +338,6 @@
def test_all_links(self):
"""Test the site.alllinks() method."""
mysite = self.get_site()
- if mysite.sitename in ('wikipedia:de', 'wikipedia:en'):
- self.skipTest(f'skipping test on {mysite} due to T359427')
fwd = list(mysite.alllinks(total=10))
uniq = list(mysite.alllinks(total=10, unique=True))
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1022005?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I16b7bd439dccfcc67b814e913955dea02a2700b4
Gerrit-Change-Number: 1022005
Gerrit-PatchSet: 3
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged