jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/699544 )
Change subject: [fix] Replace Site factory with APISite class for type
......................................................................
[fix] Replace Site factory with APISite class for type
Bug: T284880
Change-Id: Iee7cf756abb9f326b3c9c146f4c513e42d5ce119
---
M pywikibot/pagegenerators.py
1 file changed, 1 insertion(+), 1 deletion(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/pagegenerators.py b/pywikibot/pagegenerators.py
index 2ecd9f5..9c59f70 100644
--- a/pywikibot/pagegenerators.py
+++ b/pywikibot/pagegenerators.py
@@ -1580,7 +1580,7 @@
def TextIOPageGenerator(source: Optional[str] = None,
- site: Optional[pywikibot.Site] = None):
+ site: Optional[pywikibot.site.BaseSite] = None):
"""Iterate pages from a list in a text file or on a webpage.
The text source must contain page links between double-square-brackets or,
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/699544
To unsubscribe, or for help writing mail filters, visit https://gerrit.wikimedia.org/r/settings
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Iee7cf756abb9f326b3c9c146f4c513e42d5ce119
Gerrit-Change-Number: 699544
Gerrit-PatchSet: 1
Gerrit-Owner: JJMC89 <JJMC89.Wikimedia(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/698328 )
Change subject: [bugfix] clear put_queue when canceling page save
......................................................................
[bugfix] clear put_queue when canceling page save
The pywikibot._flush method is called twice: once by BaseBot.exit(),
which calls pywikibot.stopme(), and a second time at exit because
atexit registered pywikibot._flush. But the queue isn't cleared.
Clear the queue if page saving is canceled.
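
The clearing step can be sketched outside pywikibot as follows. This is a minimal illustration, not the code merged in this change: the helper name drain_queue is hypothetical, and it relies on the mutex, queue, all_tasks_done and not_full attributes that CPython's queue.Queue exposes.

```python
import queue


def drain_queue(q: queue.Queue) -> int:
    """Drop all pending items from q and return how many were dropped.

    Hypothetical helper; mirrors the order of operations in the diff.
    """
    # q.mutex is the lock shared by the queue's condition variables, so
    # holding it is what allows notify_all() to be called on them.
    with q.mutex:
        dropped = len(q.queue)
        # Wake threads blocked in join().
        q.all_tasks_done.notify_all()
        # q.queue is the underlying collections.deque of pending items.
        q.queue.clear()
        # Wake producers blocked in put() on a bounded queue.
        q.not_full.notify_all()
    return dropped


if __name__ == '__main__':
    pending = queue.Queue()
    for item in ('page A', 'page B', 'page C'):
        pending.put(item)
    print(drain_queue(pending), pending.empty())
```

Because the items are gone after the first call, a second flush (for example one registered with atexit) finds an empty queue and has nothing left to save.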
Bug: T284396
Change-Id: Ie30a82ad52b9268ba3d8fedea89c36d7fee57de1
---
M pywikibot/__init__.py
1 file changed, 7 insertions(+), 6 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/__init__.py b/pywikibot/__init__.py
index 7fd8856..d102841 100644
--- a/pywikibot/__init__.py
+++ b/pywikibot/__init__.py
@@ -1306,6 +1306,11 @@
'Estimated time remaining: {}\nReally exit?'
.format(*remaining()),
default=False, automatic_quit=False):
+ # delete the put queue
+ with page_put_queue.mutex:
+ page_put_queue.all_tasks_done.notify_all()
+ page_put_queue.queue.clear()
+ page_put_queue.not_full.notify_all()
break
# only need one drop() call because all throttles use the same global pid
@@ -1333,12 +1338,8 @@
def async_request(request, *args, **kwargs):
"""Put a request on the queue, and start the daemon if necessary."""
if not _putthread.is_alive():
- try:
- page_put_queue.mutex.acquire()
- with suppress(AssertionError, RuntimeError):
- _putthread.start()
- finally:
- page_put_queue.mutex.release()
+ with page_put_queue.mutex, suppress(AssertionError, RuntimeError):
+ _putthread.start()
page_put_queue.put((request, args, kwargs))
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/698328
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: Ie30a82ad52b9268ba3d8fedea89c36d7fee57de1
Gerrit-Change-Number: 698328
Gerrit-PatchSet: 2
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Dvorapa <dvorapa(a)seznam.cz>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: Zhuyifei1999 <zhuyifei1999(a)gmail.com>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/698655 )
Change subject: pagegenerators: Add -url option
......................................................................
pagegenerators: Add -url option
Allow users to create page generators based on a URL pointing to a page containing page titles.
Works much like the -file argument, but takes a URL instead of a local filename.
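
The behaviour described above can be sketched without pywikibot. This is a simplified illustration under stated assumptions, not the merged implementation: LINK_RE is a stand-in for pywikibot.link_regex, the names is_url and iter_titles are hypothetical, and plain title strings are yielded instead of Page objects.

```python
import io
import re
from typing import Iterator, TextIO
from urllib.parse import urlparse

# Assumption: a simplified stand-in for pywikibot.link_regex. It captures
# the target of a [[...]] link and ignores an optional "|label" part.
LINK_RE = re.compile(r'\[\[(?P<title>[^\]|]+)(?:\|[^\]]*)?\]\]')


def is_url(source: str) -> bool:
    """Return True if source has a URL scheme, else treat it as a file path."""
    return bool(urlparse(source).scheme)


def iter_titles(stream: TextIO) -> Iterator[str]:
    """Yield page titles from a text stream (hypothetical helper).

    Bracketed links take precedence; only if none are found is the text
    read as one title per line.
    """
    text = stream.read()
    found = False
    for match in LINK_RE.finditer(text):
        found = True
        yield match.group('title').strip()
    if found:
        return
    for line in text.splitlines():
        # Keep only the part before a "|" and skip blank lines.
        title = line.split('|', 1)[0].strip()
        if title:
            yield title


if __name__ == '__main__':
    print(is_url('http://www.someurl.org'), is_url('pagelist.txt'))
    print(list(iter_titles(io.StringIO('[[Foo]] and [[Bar|label]]'))))
    print(list(iter_titles(io.StringIO('Foo\nBar|label\n\n'))))
```

Wrapping fetched text in io.StringIO gives it the same read interface as an opened file, which is why one title-extraction routine can serve both the -file and -url paths.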
Bug: T239436
Change-Id: I08150994fb14f44afdc79bab086d13b2d2a74fc2
---
M pywikibot/pagegenerators.py
M scripts/interwiki.py
M scripts/movepages.py
M tests/pagegenerators_tests.py
4 files changed, 90 insertions(+), 37 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/pagegenerators.py b/pywikibot/pagegenerators.py
index d72d391..2ecd9f5 100644
--- a/pywikibot/pagegenerators.py
+++ b/pywikibot/pagegenerators.py
@@ -20,6 +20,7 @@
import calendar
import codecs
import datetime
+import io
import itertools
import json
import re
@@ -31,6 +32,7 @@
from http import HTTPStatus
from itertools import zip_longest
from typing import Optional, Union
+from urllib.parse import urlparse
from requests.exceptions import ReadTimeout
@@ -314,6 +316,11 @@
-querypage shows special pages available.
+-url Read a list of pages to treat from the provided URL.
+ The URL must return text in the same format as expected for
+ the -file argument, e.g. page titles separated by newlines
+ or enclosed in brackets.
+
FILTER OPTIONS
==============
@@ -812,6 +819,12 @@
return self.site.querypage(value)
+ def _handle_url(self, value):
+ """Handle `-url` argument."""
+ if not value:
+ value = pywikibot.input('Please enter the URL:')
+ return TextIOPageGenerator(value, site=self.site)
+
def _handle_unusedfiles(self, value):
"""Handle `-unusedfiles` argument."""
return self.site.unusedfiles(total=_int_none(value))
@@ -926,7 +939,7 @@
"""Handle `-file` argument."""
if not value:
value = pywikibot.input('Please enter the local file name:')
- return TextfilePageGenerator(value, site=self.site)
+ return TextIOPageGenerator(value, site=self.site)
def _handle_namespaces(self, value):
"""Handle `-namespaces` argument."""
@@ -1532,43 +1545,68 @@
content=content) # pragma: no cover
-def TextfilePageGenerator(filename: Optional[str] = None, site=None):
- """Iterate pages from a list in a text file.
+def _yield_titles(f: Union[codecs.StreamReaderWriter, io.StringIO],
+ site: pywikibot.Site):
+ """Yield page titles from a text stream.
- The file must contain page links between double-square-brackets or, in
- alternative, separated by newlines. The generator will yield each
+ :param f: text stream object
+ :type f: codecs.StreamReaderWriter, io.StringIO, or any other stream-like
+ object
+ :param site: Site for generator results.
+ :type site: :py:obj:`pywikibot.site.BaseSite`
+ :return: a generator that yields Page objects of pages with titles in text
+ stream
+ :rtype: generator
+ """
+ linkmatch = None
+ for linkmatch in pywikibot.link_regex.finditer(f.read()):
+ # If the link is in interwiki format, the Page object may reside
+ # on a different Site than the default.
+ # This makes it possible to work on different wikis using a single
+ # text file, but also could be dangerous because you might
+ # inadvertently change pages on another wiki!
+ yield pywikibot.Page(pywikibot.Link(linkmatch.group('title'),
+ site))
+ if linkmatch is not None:
+ return
+
+ f.seek(0)
+ for title in f:
+ title = title.strip()
+ if '|' in title:
+ title = title[:title.index('|')]
+ if title:
+ yield pywikibot.Page(site, title)
+
+
+def TextIOPageGenerator(source: Optional[str] = None,
+ site: Optional[pywikibot.Site] = None):
+ """Iterate pages from a list in a text file or on a webpage.
+
+ The text source must contain page links between double-square-brackets or,
+ alternatively, separated by newlines. The generator will yield each
corresponding Page object.
- :param filename: the name of the file that should be read. If no name is
+ :param source: the file path or URL that should be read. If no name is
given, the generator prompts the user.
:param site: Site for generator results.
:type site: :py:obj:`pywikibot.site.BaseSite`
"""
- if filename is None:
- filename = pywikibot.input('Please enter the filename:')
+ if source is None:
+ source = pywikibot.input('Please enter the filename / URL:')
if site is None:
site = pywikibot.Site()
- with codecs.open(filename, 'r', config.textfile_encoding) as f:
- linkmatch = None
- for linkmatch in pywikibot.link_regex.finditer(f.read()):
- # If the link is in interwiki format, the Page object may reside
- # on a different Site than the default.
- # This makes it possible to work on different wikis using a single
- # text file, but also could be dangerous because you might
- # inadvertently change pages on another wiki!
- yield pywikibot.Page(pywikibot.Link(linkmatch.group('title'),
- site))
- if linkmatch is not None:
- return
-
- f.seek(0)
- for title in f:
- title = title.strip()
- if '|' in title:
- title = title[:title.index('|')]
- if title:
- yield pywikibot.Page(site, title)
+ # If source cannot be parsed as an HTTP URL, treat as local file
+ if not urlparse(source).scheme:
+ with codecs.open(source, 'r', config.textfile_encoding) as f:
+ yield from _yield_titles(f, site)
+ # Else, fetch page (page should return text in same format as that expected
+ # in filename, i.e. pages separated by newlines or pages enclosed in double
+ # brackets
+ else:
+ with io.StringIO(http.fetch(source).text) as f:
+ yield from _yield_titles(f, site)
def PagesFromTitlesGenerator(iterable, site=None):
@@ -2966,6 +3004,8 @@
PreloadingItemGenerator = redirect_func(PreloadingEntityGenerator,
old_name='PreloadingItemGenerator',
since='20170314')
+TextfilePageGenerator = redirect_func(
+ TextIOPageGenerator, old_name='TextfilePageGenerator', since='20210611')
if __name__ == '__main__': # pragma: no cover
pywikibot.output('Pagegenerators cannot be run as script - are you '
diff --git a/scripts/interwiki.py b/scripts/interwiki.py
index b32a085..3579259 100755
--- a/scripts/interwiki.py
+++ b/scripts/interwiki.py
@@ -512,7 +512,7 @@
if value.isdigit():
self.needlimit = int(value)
elif arg == 'skipfile':
- skip_page_gen = pagegenerators.TextfilePageGenerator(value)
+ skip_page_gen = pagegenerators.TextIOPageGenerator(value)
self.skip.update(skip_page_gen)
del skip_page_gen
elif arg == 'neverlink':
@@ -521,7 +521,7 @@
self.ignore += [pywikibot.Page(pywikibot.Site(), p)
for p in value.split(',')]
elif arg == 'ignorefile':
- ignore_page_gen = pagegenerators.TextfilePageGenerator(value)
+ ignore_page_gen = pagegenerators.TextIOPageGenerator(value)
self.ignore.update(ignore_page_gen)
del ignore_page_gen
elif arg == 'showpage':
@@ -2298,7 +2298,7 @@
continue
pywikibot.output('Retrieving pages from dump file ' + tail)
- for page in pagegenerators.TextfilePageGenerator(filename, site):
+ for page in pagegenerators.TextIOPageGenerator(filename, site):
if site == self.site:
self._next_page = page.title(with_ns=False) + '!'
self._next_namespace = page.namespace()
diff --git a/scripts/movepages.py b/scripts/movepages.py
index cefed76..eadd26a 100755
--- a/scripts/movepages.py
+++ b/scripts/movepages.py
@@ -197,7 +197,7 @@
else:
filename = arg[len('-pairsfile:'):]
oldName1 = None
- for page in pagegenerators.TextfilePageGenerator(filename):
+ for page in pagegenerators.TextIOPageGenerator(filename):
if oldName1:
fromToPairs.append([oldName1, page.title()])
oldName1 = None
diff --git a/tests/pagegenerators_tests.py b/tests/pagegenerators_tests.py
index 120469c..f04f50d 100644
--- a/tests/pagegenerators_tests.py
+++ b/tests/pagegenerators_tests.py
@@ -424,7 +424,7 @@
self.assertLength({item['revid'] for item in items}, self.length)
-class TestTextfilePageGenerator(DefaultSiteTestCase):
+class TestTextIOPageGenerator(DefaultSiteTestCase):
"""Test loading pages from a textfile."""
@@ -444,10 +444,10 @@
)
def test_brackets(self):
- """Test TextfilePageGenerator with brackets."""
+ """Test TextIOPageGenerator with brackets."""
filename = join_data_path('pagelist-brackets.txt')
site = self.get_site()
- titles = list(pagegenerators.TextfilePageGenerator(filename, site))
+ titles = list(pagegenerators.TextIOPageGenerator(filename, site))
self.assertLength(titles, self.expected_titles)
expected_titles = [
expected_title[self.title_columns[site.namespaces[page.namespace()]
@@ -456,10 +456,10 @@
self.assertPageTitlesEqual(titles, expected_titles)
def test_lines(self):
- """Test TextfilePageGenerator with newlines."""
+ """Test TextIOPageGenerator with newlines."""
filename = join_data_path('pagelist-lines.txt')
site = self.get_site()
- titles = list(pagegenerators.TextfilePageGenerator(filename, site))
+ titles = list(pagegenerators.TextIOPageGenerator(filename, site))
self.assertLength(titles, self.expected_titles)
expected_titles = [
expected_title[self.title_columns[site.namespaces[page.namespace()]
@@ -467,6 +467,19 @@
for expected_title, page in zip(self.expected_titles, titles)]
self.assertPageTitlesEqual(titles, expected_titles)
+ @unittest.mock.patch('pywikibot.comms.http.fetch', autospec=True)
+ def test_url(self, mock_fetch):
+ """Test TextIOPageGenerator with URL."""
+ # Mock return value of fetch()
+ fetch_return = unittest.mock.Mock()
+ fetch_return.text = '\n'.join(
+ [title[0] for title in self.expected_titles])
+ mock_fetch.return_value = fetch_return
+ site = self.get_site()
+ titles = list(
+ pagegenerators.TextIOPageGenerator('http://www.someurl.org', site))
+ self.assertLength(titles, self.expected_titles)
+
class TestYearPageGenerator(DefaultSiteTestCase):
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/698655
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I08150994fb14f44afdc79bab086d13b2d2a74fc2
Gerrit-Change-Number: 698655
Gerrit-PatchSet: 5
Gerrit-Owner: Chris Maynor <cmchrismaynor(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged
jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/699483 )
Change subject: [doc] Update ROADMAP.rst and other docs
......................................................................
[doc] Update ROADMAP.rst and other docs
Change-Id: I4d60323be9c4bd0e6ace04201d99bc627af8886e
---
M ROADMAP.rst
M pywikibot/specialbots/_upload.py
M pywikibot/textlib.py
3 files changed, 8 insertions(+), 0 deletions(-)
Approvals:
Xqt: Looks good to me, approved
jenkins-bot: Verified
diff --git a/ROADMAP.rst b/ROADMAP.rst
index 9f0b216..01e5632 100644
--- a/ROADMAP.rst
+++ b/ROADMAP.rst
@@ -1,6 +1,10 @@
Current release changes
^^^^^^^^^^^^^^^^^^^^^^^
+* add add_text function to textlib (T284388)
+* Require setuptools >= 49.4.0 (T284297)
+* Require wikitextparser>=0.47.5
+* Allow images to upload locally even they exist in the shared repository (T267535)
* Show a warning if pywikibot.__version__ is behind scripts.__version__ (T282766)
* Handle <ce>/<chem> tags as <math> aliases within textlib.replaceExcept() (T283990)
* Expand simulate query response for wikibase support (T76694)
diff --git a/pywikibot/specialbots/_upload.py b/pywikibot/specialbots/_upload.py
index cdc6c5b..92d9c4f 100644
--- a/pywikibot/specialbots/_upload.py
+++ b/pywikibot/specialbots/_upload.py
@@ -57,6 +57,8 @@
*Changed in version 6.2:* asynchronous upload is used if
*asynchronous* parameter is set.
+ *New in version 6.4:* force_if_shared parameter.
+
:param url: path to url or local file, or list of urls or paths
to local files.
:param description: Description of file for its page. If multiple files
diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index 9deac38..4c301c0 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -835,6 +835,8 @@
def add_text(text: str, add: str, *, site=None) -> str:
"""Add text to a page content above categories and interwiki.
+ *New in version 6.4.*
+
:param text: The page content to add text to.
:param add: Text to add.
:param site: The site that the text is coming from. Required for
--
To view, visit https://gerrit.wikimedia.org/r/c/pywikibot/core/+/699483
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I4d60323be9c4bd0e6ace04201d99bc627af8886e
Gerrit-Change-Number: 699483
Gerrit-PatchSet: 2
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot
Gerrit-MessageType: merged