jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/364993 )
Change subject: Decommission rcstream
......................................................................
Decommission rcstream
rcstream has been offline since 2017-07-10.
- remove comms/rcstream.py
- change docs/api_ref/pywikibot.comms.rst which solves T168831
- change docs and README.rst
- add usage documentation to EventStreams
- update documentation in pagegenerators.py
- remove rcstream_port and rcstream_path methods from WikimediaFamily
and rename rcstream_host to eventstreams_host (rcstream_host stays as a
deprecated alias)
- remove the socketIO-client dependency from setup.py
- skip doctest for eventstreams.py
- add numpydoc and autosummary to enable section headers in docs
Bug: T170534
Bug: T168831
Change-Id: Ic5de5d07c5065c6c2759c7eef4fdb83ab10b8b6f
---
M docs/api_ref/pywikibot.comms.rst
M docs/conf.py
M docs/requirements-py3.txt
M pywikibot/README.rst
M pywikibot/comms/eventstreams.py
D pywikibot/comms/rcstream.py
M pywikibot/family.py
M pywikibot/pagegenerators.py
M setup.py
M tox.ini
10 files changed, 57 insertions(+), 262 deletions(-)
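
Note on the host rename: the family.py hunk keeps rcstream_host as a deprecated alias that forwards to eventstreams_host, and eventstreams.py builds its URL from host, path and stream name. The following is only a standalone sketch of that arrangement. SketchFamily, stream_url and the '/v2/stream' path are assumptions made for illustration; pywikibot itself uses its own @deprecated decorator rather than warnings.warn.

```python
import warnings


class SketchFamily(object):
    """Hypothetical stand-in for a family class; not pywikibot code."""

    def eventstreams_host(self, code):
        """Return the stream host (value taken from the diff)."""
        return 'https://stream.wikimedia.org'

    def rcstream_host(self, code):
        """Deprecated alias that forwards to eventstreams_host."""
        warnings.warn('rcstream_host is deprecated; use eventstreams_host',
                      DeprecationWarning, stacklevel=2)
        return self.eventstreams_host(code)


def stream_url(family, code, path, stream):
    """Join host, path and stream name the way the patched code formats them."""
    return '{0}{1}/{2}'.format(family.eventstreams_host(code), path, stream)


if __name__ == '__main__':
    # '/v2/stream' is an assumed path; the diff does not show its value.
    print(stream_url(SketchFamily(), 'en', '/v2/stream', 'recentchange'))
```

Callers that still use rcstream_host keep working but see a DeprecationWarning, which gives them time to switch before the alias is dropped.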
Approvals:
Krinkle: Looks good to me, but someone else must approve
Dalba: Looks good to me, approved
jenkins-bot: Verified
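
Note on the behaviour change in site_rc_listener: with the rcstream fallback gone, a missing sseclient now raises ImportError on first use instead of emitting a warning. Both the old and the new module use the same idiom: the ImportError is stored under the module's name at import time and checked later with isinstance. A minimal standalone sketch of that idiom follows. The module name optional_sse_backend and the functions listen() and connect() are hypothetical, chosen so that the import fails and the error path can be seen.

```python
"""Deferred ImportError idiom (illustrative sketch, not pywikibot code)."""

try:
    # Hypothetical optional dependency; it is not expected to exist,
    # so the except branch below is taken.
    import optional_sse_backend as backend
except ImportError as error:
    # Store the exception in place of the module. Importing this file
    # still succeeds; the failure is reported only when the feature is used.
    backend = error


def listen(total=None):
    """Return a listener, or raise ImportError if the backend is missing."""
    if isinstance(backend, Exception):
        raise ImportError(
            'optional_sse_backend is required for listen(); '
            'original error: {0}'.format(backend))
    # Hypothetical API of the optional backend.
    return backend.connect(total=total)


if __name__ == '__main__':
    try:
        listen(total=1)
    except ImportError as exc:
        print('listen() failed as expected: {0}'.format(exc))
```

Raising instead of falling back means callers get an explicit failure rather than a silent switch to a service that no longer exists.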
diff --git a/docs/api_ref/pywikibot.comms.rst b/docs/api_ref/pywikibot.comms.rst
index 2c25c5d..39ccc28 100644
--- a/docs/api_ref/pywikibot.comms.rst
+++ b/docs/api_ref/pywikibot.comms.rst
@@ -17,10 +17,10 @@
:undoc-members:
:show-inheritance:
-pywikibot.comms.rcstream module
+pywikibot.comms.eventstreams module
-------------------------------
-.. automodule:: pywikibot.comms.rcstream
+.. automodule:: pywikibot.comms.eventstreams
:members:
:undoc-members:
:show-inheritance:
diff --git a/docs/conf.py b/docs/conf.py
index 17cf9ac..04d20f7 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
"""Configuration file for Sphinx."""
#
-# (C) Pywikibot team, 2015-2016
+# (C) Pywikibot team, 2015-2017
#
# Distributed under the terms of the MIT license.
#
@@ -36,7 +36,9 @@
'sphinx_epytext',
'sphinx.ext.todo',
'sphinx.ext.coverage',
- 'sphinx.ext.viewcode']
+ 'sphinx.ext.viewcode',
+ 'sphinx.ext.autosummary',
+ 'numpydoc']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
diff --git a/docs/requirements-py3.txt b/docs/requirements-py3.txt
index c99a6f2..7403a65 100644
--- a/docs/requirements-py3.txt
+++ b/docs/requirements-py3.txt
@@ -4,4 +4,5 @@
sphinx==1.3.1
sphinx-epytext>=0.0.4
+numpydoc
diff --git a/pywikibot/README.rst b/pywikibot/README.rst
index 702f1a1..181ddd3 100644
--- a/pywikibot/README.rst
+++ b/pywikibot/README.rst
@@ -106,11 +106,9 @@
  +---------------------------+-------------------------------------------------------+
  | comms                     | Communication layer.                                  |
  +===========================+=======================================================+
- | eventstreams.py           | rcstream client for server sent events                |
+ | eventstreams.py           | stream client for server sent events                  |
  +---------------------------+-------------------------------------------------------+
  | http.py                   | Basic HTTP access interface                           |
- +---------------------------+-------------------------------------------------------+
- | rcstream.py               | SocketIO-based rcstream client (deprecated)           |
  +---------------------------+-------------------------------------------------------+
  | threadedhttp.py           | Httplib2 threaded cookie layer extending httplib2     |
  +---------------------------+-------------------------------------------------------+
diff --git a/pywikibot/comms/eventstreams.py b/pywikibot/comms/eventstreams.py
index 2236481..5d4db87 100644
--- a/pywikibot/comms/eventstreams.py
+++ b/pywikibot/comms/eventstreams.py
@@ -38,6 +38,28 @@
It provides access to arbitrary streams of data including recent changes.
It replaces rcstream.py implementation.
+
+ Usage:
+
+ >>> stream = EventStreams(stream='recentchange')
+ >>> change = iter(stream).next()
+ >>> change
+ {'comment': '/* wbcreateclaim-create:1| */ [[Property:P31]]: [[Q4167836]]',
+ 'wiki': 'wikidatawiki', 'type': 'edit', 'server_name': 'www.wikidata.org',
+ 'server_script_path': '/w', 'namespace': 0, 'title': 'Q32857263',
+ 'bot': True, 'server_url': 'https://www.wikidata.org',
+ 'length': {'new': 1223, 'old': 793},
+ 'meta': {'domain': 'www.wikidata.org', 'partition': 0,
+ 'uri': 'https://www.wikidata.org/wiki/Q32857263',
+ 'offset': 288986585, 'topic': 'eqiad.mediawiki.recentchange',
+ 'request_id': '1305a006-8204-4f51-a27b-0f2df58289f4',
+ 'schema_uri': 'mediawiki/recentchange/1',
+ 'dt': '2017-07-13T10:55:31+00:00',
+ 'id': 'ca13742b-67b9-11e7-935d-141877614a33'},
+ 'user': 'XXN-bot', 'timestamp': 1499943331, 'patrolled': True,
+ 'id': 551158959, 'minor': False,
+ 'revision': {'new': 518751558, 'old': 517180066}}
+ >>> del stream
"""
def __init__(self, **kwargs):
@@ -80,7 +102,7 @@
raise NotImplementedError(
'No stream specified for class {0}'
.format(self.__class__.__name__))
- self._url = ('{0}{1}/{2}'.format(self._site.rcstream_host(),
+ self._url = ('{0}{1}/{2}'.format(self._site.eventstreams_host(),
self._site.eventstreams_path(),
self._stream))
return self._url
@@ -106,6 +128,7 @@
Filter types
============
+
There are 3 types of filter: 'all', 'any' and 'none'.
The filter type must be given with the keyword argument 'ftype'
(see below). If no 'ftype' keyword argument is given, 'all' is
@@ -120,6 +143,7 @@
Filter functions
================
+
Filter may be specified as external function methods given as
positional argument like::
@@ -134,6 +158,7 @@
Filter keys and values
======================
+
Another method to register a filter is to pass pairs of keys and values
as keyword arguments to this method. The key must be a key of the event
data dict and the value must be any value or an iterable of values the
@@ -247,20 +272,11 @@
@type total: int
@return: pywikibot.comms.eventstream.rc_listener configured for given site
+ @raises ImportError: sseclient installation is required
"""
if isinstance(EventSource, Exception):
- warning('sseclient is required for EventStreams;\n'
- 'install it with "pip install sseclient"\n')
- # fallback to old rcstream method
- # NOTE: this will be deprecated soon
- from pywikibot.comms.rcstream import rc_listener
- return rc_listener(
- wikihost=site.hostname(),
- rchost=site.rcstream_host(),
- rcport=site.rcstream_port(),
- rcpath=site.rcstream_path(),
- total=total,
- )
+ raise ImportError('sseclient is required for EventStreams;\n'
+ 'install it with "pip install sseclient"\n')
stream = EventStreams(stream='recentchange', site=site)
stream.set_maximum_items(total)
diff --git a/pywikibot/comms/rcstream.py b/pywikibot/comms/rcstream.py
deleted file mode 100644
index a235f90..0000000
--- a/pywikibot/comms/rcstream.py
+++ /dev/null
@@ -1,226 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
-SocketIO-based rcstream client.
-
-This file is part of the Pywikibot framework.
-
-This module requires socketIO_client to be installed:
- pip install socketIO_client
-"""
-#
-# (C) 2014 Merlijn van Deen
-# (C) Pywikibot team, 2014-2017
-#
-# Distributed under the terms of the MIT license.
-#
-from __future__ import absolute_import, unicode_literals
-
-__version__ = '$Id$'
-#
-
-import sys
-import threading
-
-if sys.version_info[0] > 2:
- from queue import Queue, Empty
-else:
- from Queue import Queue, Empty
-
-try:
- import socketIO_client
-except ImportError as e:
- socketIO_client = e
-
-from pywikibot.bot import debug, warning
-from pywikibot.tools import deprecated
-
-_logger = 'pywikibot.rcstream'
-
-
-class RcListenerThread(threading.Thread):
-
- """
- Low-level RC Listener Thread, pushing RC stream events into a queue.
-
- @param wikihost: the hostname of the wiki we want to get changes for. This
- is passed to rcstream using a 'subscribe' command. Pass
- '*' to listen to all wikis for a given rc host.
- @param rchost: the recent changes stream host to connect to. For Wikimedia
- wikis, this is 'https://stream.wikimedia.org'
- @param rcport: the port to connect to (default: 80)
- @param rcpath: the sockets.io path. For Wikimedia wikis, this is '/rc'.
- (default: '/rc')
- @param total: the maximum number of entries to return. The underlying
- thread is shut down then this number is reached.
-
- This part of the rc listener runs in a Thread. It makes the actual
- socketIO/websockets connection to the rc stream server, subscribes
- to a single site and pushes those entries into a queue.
-
- Usage:
-
- >>> t = RcListenerThread('en.wikipedia.org', 'https://stream.wikimedia.org')
- >>> t.start()
- >>> change = t.queue.get()
- >>> change
- {'server_name': 'en.wikipedia.org', 'wiki': 'enwiki', 'minor': True,
- 'length': {'new': 2038, 'old': 2009}, 'timestamp': 1419964350,
- 'server_script_path': '/w', 'bot': False, 'user': 'Od Mishehu',
- 'comment': 'stub sorting', 'title': 'Bradwell Bay Wilderness',
- 'server_url': 'http://en.wikipedia.org', 'id': 703158386,
- 'revision': {'new': 640271171, 'old': 468264850},
- 'type': 'edit', 'namespace': 0}
- >>> t.stop() # optional, the thread will shut down on exiting python
- """
-
- def __init__(self, wikihost, rchost, rcport=80, rcpath='/rc', total=None):
- """Constructor for RcListenerThread."""
- super(RcListenerThread, self).__init__()
- self.rchost = rchost
- self.rcport = rcport
- self.rcpath = rcpath
- self.wikihost = wikihost
-
- self.daemon = True
- self.running = False
- self.queue = Queue()
-
- self.warn_queue_length = 100
-
- self.total = total
- self.count = 0
-
- debug('Opening connection to %r' % self, _logger)
- self.client = socketIO_client.SocketIO(rchost, rcport)
-
- thread = self
-
- class RCListener(socketIO_client.BaseNamespace):
- def on_change(self, change):
- debug('Received change %r' % change, _logger)
- if not thread.running:
- debug('Thread in shutdown mode; ignoring change.', _logger)
- return
-
- thread.count += 1
- thread.queue.put(change)
- if thread.queue.qsize() > thread.warn_queue_length:
- warning('%r queue length exceeded %i'
- % (thread,
- thread.warn_queue_length),
- _logger=_logger)
- thread.warn_queue_length = thread.warn_queue_length + 100
-
- if thread.total is not None and thread.count >= thread.total:
- thread.stop()
- return
-
- def on_connect(self):
- debug('Connected to %r; subscribing to %s'
- % (thread, thread.wikihost),
- _logger)
- self.emit('subscribe', thread.wikihost)
- debug('Subscribed to %s' % thread.wikihost, _logger)
-
- def on_reconnect(self):
- debug('Reconnected to %r' % (thread,), _logger)
- self.on_connect()
-
- class GlobalListener(socketIO_client.BaseNamespace):
- def on_heartbeat(self):
- self._transport.send_heartbeat()
-
- self.client.define(RCListener, rcpath)
- self.client.define(GlobalListener)
-
- def __repr__(self):
- """Return representation."""
- return "<rcstream for socketio://%s@%s:%s%s>" % (
- self.wikihost, self.rchost, self.rcport, self.rcpath
- )
-
- def run(self):
- """
- Threaded function.
-
- Runs inside the thread when started with .start().
- """
- self.running = True
- while self.running:
- self.client.wait(seconds=0.1)
- debug('Shut down event loop for %r' % self, _logger)
- self.client.disconnect()
- debug('Disconnected %r' % self, _logger)
- self.queue.put(None)
-
- def stop(self):
- """Stop the thread."""
- self.running = False
-
-
-def rc_listener(wikihost, rchost, rcport=80, rcpath='/rc', total=None):
- """Yield changes received from RCstream.
-
- @param wikihost: the hostname of the wiki we want to get changes for. This
- is passed to rcstream using a 'subscribe' command. Pass
- '*' to listen to all wikis for a given rc host.
- @param rchost: the recent changes stream host to connect to. For Wikimedia
- wikis, this is 'https://stream.wikimedia.org'
- @param rcport: the port to connect to (default: 80)
- @param rcpath: the sockets.io path. For Wikimedia wikis, this is '/rc'.
- (default: '/rc')
- @param total: the maximum number of entries to return. The underlying thread
- is shut down then this number is reached.
-
- @return: yield dict as formatted by MediaWiki's
- MachineReadableRCFeedFormatter, which consists of at least id
- (recent changes id), type ('edit', 'new', 'log' or 'external'),
- namespace, title, comment, timestamp, user and bot (bot flag for the
- change).
- @see: U{MachineReadableRCFeedFormatter<https://doc.wikimedia.org/
- mediawiki-core/master/php/classMachineReadableRCFeedFormatter.html>}
- @rtype: generator
- @raises ImportError
- """
- if isinstance(socketIO_client, Exception):
- raise ImportError('socketIO_client is required for the rc stream;\n'
- 'install it with pip install "socketIO_client==0.5.6"')
-
- rc_thread = RcListenerThread(
- wikihost=wikihost,
- rchost=rchost, rcport=rcport, rcpath=rcpath,
- total=total
- )
-
- debug('Starting rcstream thread %r' % rc_thread,
- _logger)
- rc_thread.start()
-
- while True:
- try:
- element = rc_thread.queue.get(timeout=0.1)
- except Empty:
- continue
- if element is None:
- return
- yield element
-
-
-@deprecated('eventstreams.site_rc_listener')
-def site_rc_listener(site, total=None):
- """Yield changes received from RCstream.
-
- @param site: the Pywikibot.Site object to yield live recent changes for
- @type site: Pywikibot.BaseSite
- @param total: the maximum number of changes to return
- @type total: int
-
- @return: pywikibot.comms.rcstream.rc_listener configured for the given site
- """
- return rc_listener(
- wikihost=site.hostname(),
- rchost=site.rcstream_host(),
- rcport=site.rcstream_port(),
- rcpath=site.rcstream_path(),
- total=total,
- )
diff --git a/pywikibot/family.py b/pywikibot/family.py
index 1ff3d7f..d5b8d6f 100644
--- a/pywikibot/family.py
+++ b/pywikibot/family.py
@@ -1144,12 +1144,19 @@
def rcstream_host(self, code):
"""Hostname for RCStream."""
- raise NotImplementedError(
- 'This family does support neither RCStream nor EventStreams')
+ raise NotImplementedError('This family does not support RCStream')
def rcstream_path(self, code):
"""Return path for RCStream."""
raise NotImplementedError("This family does not support RCStream")
+
+ def rcstream_port(self, code):
+ """Return port for RCStream."""
+ raise NotImplementedError('This family does not support RCStream')
+
+ def eventstreams_host(self, code):
+ """Hostname for EventStreams."""
+ raise NotImplementedError('This family does not support EventStreams')
def eventstreams_path(self, code):
"""Return path for EventStreams."""
@@ -1639,17 +1646,14 @@
"""Return 'https' as the protocol."""
return 'https'
+ @deprecated('eventstreams_host')
def rcstream_host(self, code):
- """Return 'https://stream.wikimedia.org' as the RCStream hostname."""
+ """DEPRECATED: use eventstreams_host instead."""
+ return self.eventstreams_host(code)
+
+ def eventstreams_host(self, code):
+ """Return 'https://stream.wikimedia.org' as the stream hostname."""
return 'https://stream.wikimedia.org'
-
- def rcstream_port(self, code):
- """Return 443 as the RCStream port number."""
- return 443
-
- def rcstream_path(self, code):
- """Return path for RCStream."""
- return '/rc'
def eventstreams_path(self, code):
"""Return path for EventStreams."""
diff --git a/pywikibot/pagegenerators.py b/pywikibot/pagegenerators.py
index 32c1c22..f125ab8 100644
--- a/pywikibot/pagegenerators.py
+++ b/pywikibot/pagegenerators.py
@@ -2403,10 +2403,12 @@
"""
Yield pages from a socket.io RC stream.
- Generates pages based on the socket.io recent changes stream.
+ Generates pages based on the EventStreams Server-Sent-Event (SSE) recent
+ changes stream.
The Page objects will have an extra property ._rcinfo containing the
literal rc data. This can be used to e.g. filter only new pages. See
- `pywikibot.comms.rcstream.rc_listener` for details on the .rcinfo format.
+ `pywikibot.comms.eventstreams.rc_listener` for details on the .rcinfo
+ format.
@param site: site to return recent changes for
@type site: pywikibot.BaseSite
diff --git a/setup.py b/setup.py
index cb74bca..1657ad5 100644
--- a/setup.py
+++ b/setup.py
@@ -61,8 +61,6 @@
'IRC': [irc_dep],
'mwparserfromhell': ['mwparserfromhell>=0.3.3'],
'Tkinter': ['Pillow<3.5.0' if PY26 else 'Pillow'],
- # 0.6.1 supports socket.io 1.0, but WMF is using 0.9 (T91393 and T85716)
- 'rcstream': ['socketIO-client<0.6.1'],
'security': ['requests[security]', 'pycparser!=2.14'],
'mwoauth': ['mwoauth>=0.2.4,!=0.3.1'],
'html': ['BeautifulSoup4'],
diff --git a/tox.ini b/tox.ini
index 724aa80..0b652a8 100644
--- a/tox.ini
+++ b/tox.ini
@@ -12,7 +12,7 @@
envlist = flake8,pyflakes-{py3,pypy}
[params]
-doctest_skip = --ignore-files=(gui\.py|botirc\.py|rcstream\.py)
+doctest_skip = --ignore-files=(gui\.py|botirc\.py|eventstreams\.py)
[testenv]
setenv =
--
To view, visit https://gerrit.wikimedia.org/r/364993
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic5de5d07c5065c6c2759c7eef4fdb83ab10b8b6f
Gerrit-PatchSet: 12
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Krinkle <krinklemail(a)gmail.com>
Gerrit-Reviewer: Lokal Profil <lokal.profil(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>