jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/364993 )
Change subject: Decommission rcstream
......................................................................
Decommission rcstream
rcstream has been offline since 2017-07-10.

- remove comms/rcstream.py
- change docs/api_ref/pywikibot.comms.rst, which solves T168831
- change docs and README.rst
- add usage documentation to EventStreams
- update documentation in pagegenerators.py
- remove rcstream_port and rcstream_path methods from WikimediaFamily
  and rename rcstream_host to eventstreams_host (rcstream_host remains
  as a deprecated alias)
- remove the socketIO-client dependency from setup.py
- skip doctest for eventstreams.py
- add numpydoc and autosummary to enable section headers in docs
Bug: T170534
Bug: T168831
Change-Id: Ic5de5d07c5065c6c2759c7eef4fdb83ab10b8b6f
---
M docs/api_ref/pywikibot.comms.rst
M docs/conf.py
M docs/requirements-py3.txt
M pywikibot/README.rst
M pywikibot/comms/eventstreams.py
D pywikibot/comms/rcstream.py
M pywikibot/family.py
M pywikibot/pagegenerators.py
M setup.py
M tox.ini
10 files changed, 57 insertions(+), 262 deletions(-)
Approvals:
  Krinkle: Looks good to me, but someone else must approve
  Dalba: Looks good to me, approved
  jenkins-bot: Verified
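
The two behavioural changes in this patch (rcstream_host kept only as a deprecated alias for eventstreams_host, and a missing sseclient now raising ImportError instead of falling back to rcstream) can be sketched roughly as below. This is a minimal standalone illustration, not the merged code: FamilySketch and require_sseclient are hypothetical names, and the standard library's warnings.warn stands in for pywikibot's own @deprecated decorator.

```python
import warnings


class FamilySketch(object):
    """Hypothetical stand-in for WikimediaFamily, for illustration only."""

    def eventstreams_host(self, code):
        """Return the stream hostname (value taken from the diff)."""
        return 'https://stream.wikimedia.org'

    def rcstream_host(self, code):
        """Deprecated alias that warns and delegates to eventstreams_host."""
        # Assumption: warnings.warn replaces pywikibot's @deprecated here.
        warnings.warn('rcstream_host is deprecated; use eventstreams_host',
                      DeprecationWarning, stacklevel=2)
        return self.eventstreams_host(code)


def require_sseclient(event_source):
    """Raise ImportError if the stored import result is an exception.

    Hypothetical helper mirroring the isinstance(EventSource, Exception)
    check in site_rc_listener; anything else is returned unchanged.
    """
    if isinstance(event_source, Exception):
        raise ImportError('sseclient is required for EventStreams;\n'
                          'install it with "pip install sseclient"\n')
    return event_source
```

Callers that still use rcstream_host keep working but see a DeprecationWarning, while callers without sseclient now fail early rather than silently using the decommissioned service.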
diff --git a/docs/api_ref/pywikibot.comms.rst b/docs/api_ref/pywikibot.comms.rst index 2c25c5d..39ccc28 100644 --- a/docs/api_ref/pywikibot.comms.rst +++ b/docs/api_ref/pywikibot.comms.rst @@ -17,10 +17,10 @@ :undoc-members: :show-inheritance:
-pywikibot.comms.rcstream module
+pywikibot.comms.eventstreams module
 -------------------------------
-.. automodule:: pywikibot.comms.rcstream +.. automodule:: pywikibot.comms.eventstreams :members: :undoc-members: :show-inheritance: diff --git a/docs/conf.py b/docs/conf.py index 17cf9ac..04d20f7 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -1,7 +1,7 @@ # -*- coding: utf-8 -*- """Configuration file for Sphinx.""" # -# (C) Pywikibot team, 2015-2016 +# (C) Pywikibot team, 2015-2017 # # Distributed under the terms of the MIT license. # @@ -36,7 +36,9 @@ 'sphinx_epytext', 'sphinx.ext.todo', 'sphinx.ext.coverage', - 'sphinx.ext.viewcode'] + 'sphinx.ext.viewcode', + 'sphinx.ext.autosummary', + 'numpydoc']
# Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] diff --git a/docs/requirements-py3.txt b/docs/requirements-py3.txt index c99a6f2..7403a65 100644 --- a/docs/requirements-py3.txt +++ b/docs/requirements-py3.txt @@ -4,4 +4,5 @@
 sphinx==1.3.1
 sphinx-epytext>=0.0.4
+numpydoc
diff --git a/pywikibot/README.rst b/pywikibot/README.rst index 702f1a1..181ddd3 100644 --- a/pywikibot/README.rst +++ b/pywikibot/README.rst @@ -106,11 +106,9 @@ +---------------------------+-------------------------------------------------------+ | comms | Communication layer. | +===========================+=======================================================+ - | eventstreams.py | rcstream client for server sent events | + | eventstreams.py | stream client for server sent events | +---------------------------+-------------------------------------------------------+ | http.py | Basic HTTP access interface | - +---------------------------+-------------------------------------------------------+ - | rcstream.py | SocketIO-based rcstream client (deprecated) | +---------------------------+-------------------------------------------------------+ | threadedhttp.py | Httplib2 threaded cookie layer extending httplib2 | +---------------------------+-------------------------------------------------------+ diff --git a/pywikibot/comms/eventstreams.py b/pywikibot/comms/eventstreams.py index 2236481..5d4db87 100644 --- a/pywikibot/comms/eventstreams.py +++ b/pywikibot/comms/eventstreams.py @@ -38,6 +38,28 @@
It provides access to arbitrary streams of data including recent changes. It replaces rcstream.py implementation. + + Usage: + + >>> stream = EventStreams(stream='recentchange') + >>> change = iter(stream).next() + >>> change + {'comment': '/* wbcreateclaim-create:1| */ [[Property:P31]]: [[Q4167836]]', + 'wiki': 'wikidatawiki', 'type': 'edit', 'server_name': 'www.wikidata.org', + 'server_script_path': '/w', 'namespace': 0, 'title': 'Q32857263', + 'bot': True, 'server_url': 'https://www.wikidata.org', + 'length': {'new': 1223, 'old': 793}, + 'meta': {'domain': 'www.wikidata.org', 'partition': 0, + 'uri': 'https://www.wikidata.org/wiki/Q32857263', + 'offset': 288986585, 'topic': 'eqiad.mediawiki.recentchange', + 'request_id': '1305a006-8204-4f51-a27b-0f2df58289f4', + 'schema_uri': 'mediawiki/recentchange/1', + 'dt': '2017-07-13T10:55:31+00:00', + 'id': 'ca13742b-67b9-11e7-935d-141877614a33'}, + 'user': 'XXN-bot', 'timestamp': 1499943331, 'patrolled': True, + 'id': 551158959, 'minor': False, + 'revision': {'new': 518751558, 'old': 517180066}} + >>> del stream """
def __init__(self, **kwargs): @@ -80,7 +102,7 @@ raise NotImplementedError( 'No stream specified for class {0}' .format(self.__class__.__name__)) - self._url = ('{0}{1}/{2}'.format(self._site.rcstream_host(), + self._url = ('{0}{1}/{2}'.format(self._site.eventstreams_host(), self._site.eventstreams_path(), self._stream)) return self._url @@ -106,6 +128,7 @@
Filter types ============ + There are 3 types of filter: 'all', 'any' and 'none'. The filter type must be given with the keyword argument 'ftype' (see below). If no 'ftype' keyword argument is given, 'all' is @@ -120,6 +143,7 @@
Filter functions ================ + Filter may be specified as external function methods given as positional argument like::
@@ -134,6 +158,7 @@
Filter keys and values ====================== + Another method to register a filter is to pass pairs of keys and values as keyword arguments to this method. The key must be a key of the event data dict and the value must be any value or an iterable of values the @@ -247,20 +272,11 @@ @type total: int
     @return: pywikibot.comms.eventstream.rc_listener configured for given site
+    @raises ImportError: sseclient installation is required
     """
     if isinstance(EventSource, Exception):
-        warning('sseclient is required for EventStreams;\n'
-                'install it with "pip install sseclient"\n')
-        # fallback to old rcstream method
-        # NOTE: this will be deprecated soon
-        from pywikibot.comms.rcstream import rc_listener
-        return rc_listener(
-            wikihost=site.hostname(),
-            rchost=site.rcstream_host(),
-            rcport=site.rcstream_port(),
-            rcpath=site.rcstream_path(),
-            total=total,
-        )
+        raise ImportError('sseclient is required for EventStreams;\n'
+                          'install it with "pip install sseclient"\n')
stream = EventStreams(stream='recentchange', site=site) stream.set_maximum_items(total) diff --git a/pywikibot/comms/rcstream.py b/pywikibot/comms/rcstream.py deleted file mode 100644 index a235f90..0000000 --- a/pywikibot/comms/rcstream.py +++ /dev/null @@ -1,226 +0,0 @@ -# -*- coding: utf-8 -*- -""" -SocketIO-based rcstream client. - -This file is part of the Pywikibot framework. - -This module requires socketIO_client to be installed: - pip install socketIO_client -""" -# -# (C) 2014 Merlijn van Deen -# (C) Pywikibot team, 2014-2017 -# -# Distributed under the terms of the MIT license. -# -from __future__ import absolute_import, unicode_literals - -__version__ = '$Id$' -# - -import sys -import threading - -if sys.version_info[0] > 2: - from queue import Queue, Empty -else: - from Queue import Queue, Empty - -try: - import socketIO_client -except ImportError as e: - socketIO_client = e - -from pywikibot.bot import debug, warning -from pywikibot.tools import deprecated - -_logger = 'pywikibot.rcstream' - - -class RcListenerThread(threading.Thread): - - """ - Low-level RC Listener Thread, pushing RC stream events into a queue. - - @param wikihost: the hostname of the wiki we want to get changes for. This - is passed to rcstream using a 'subscribe' command. Pass - '*' to listen to all wikis for a given rc host. - @param rchost: the recent changes stream host to connect to. For Wikimedia - wikis, this is 'https://stream.wikimedia.org' - @param rcport: the port to connect to (default: 80) - @param rcpath: the sockets.io path. For Wikimedia wikis, this is '/rc'. - (default: '/rc') - @param total: the maximum number of entries to return. The underlying - thread is shut down then this number is reached. - - This part of the rc listener runs in a Thread. It makes the actual - socketIO/websockets connection to the rc stream server, subscribes - to a single site and pushes those entries into a queue. 
- - Usage: - - >>> t = RcListenerThread('en.wikipedia.org', 'https://stream.wikimedia.org') - >>> t.start() - >>> change = t.queue.get() - >>> change - {'server_name': 'en.wikipedia.org', 'wiki': 'enwiki', 'minor': True, - 'length': {'new': 2038, 'old': 2009}, 'timestamp': 1419964350, - 'server_script_path': '/w', 'bot': False, 'user': 'Od Mishehu', - 'comment': 'stub sorting', 'title': 'Bradwell Bay Wilderness', - 'server_url': 'http://en.wikipedia.org', 'id': 703158386, - 'revision': {'new': 640271171, 'old': 468264850}, - 'type': 'edit', 'namespace': 0} - >>> t.stop() # optional, the thread will shut down on exiting python - """ - - def __init__(self, wikihost, rchost, rcport=80, rcpath='/rc', total=None): - """Constructor for RcListenerThread.""" - super(RcListenerThread, self).__init__() - self.rchost = rchost - self.rcport = rcport - self.rcpath = rcpath - self.wikihost = wikihost - - self.daemon = True - self.running = False - self.queue = Queue() - - self.warn_queue_length = 100 - - self.total = total - self.count = 0 - - debug('Opening connection to %r' % self, _logger) - self.client = socketIO_client.SocketIO(rchost, rcport) - - thread = self - - class RCListener(socketIO_client.BaseNamespace): - def on_change(self, change): - debug('Received change %r' % change, _logger) - if not thread.running: - debug('Thread in shutdown mode; ignoring change.', _logger) - return - - thread.count += 1 - thread.queue.put(change) - if thread.queue.qsize() > thread.warn_queue_length: - warning('%r queue length exceeded %i' - % (thread, - thread.warn_queue_length), - _logger=_logger) - thread.warn_queue_length = thread.warn_queue_length + 100 - - if thread.total is not None and thread.count >= thread.total: - thread.stop() - return - - def on_connect(self): - debug('Connected to %r; subscribing to %s' - % (thread, thread.wikihost), - _logger) - self.emit('subscribe', thread.wikihost) - debug('Subscribed to %s' % thread.wikihost, _logger) - - def on_reconnect(self): - 
debug('Reconnected to %r' % (thread,), _logger) - self.on_connect() - - class GlobalListener(socketIO_client.BaseNamespace): - def on_heartbeat(self): - self._transport.send_heartbeat() - - self.client.define(RCListener, rcpath) - self.client.define(GlobalListener) - - def __repr__(self): - """Return representation.""" - return "<rcstream for socketio://%s@%s:%s%s>" % ( - self.wikihost, self.rchost, self.rcport, self.rcpath - ) - - def run(self): - """ - Threaded function. - - Runs inside the thread when started with .start(). - """ - self.running = True - while self.running: - self.client.wait(seconds=0.1) - debug('Shut down event loop for %r' % self, _logger) - self.client.disconnect() - debug('Disconnected %r' % self, _logger) - self.queue.put(None) - - def stop(self): - """Stop the thread.""" - self.running = False - - -def rc_listener(wikihost, rchost, rcport=80, rcpath='/rc', total=None): - """Yield changes received from RCstream. - - @param wikihost: the hostname of the wiki we want to get changes for. This - is passed to rcstream using a 'subscribe' command. Pass - '*' to listen to all wikis for a given rc host. - @param rchost: the recent changes stream host to connect to. For Wikimedia - wikis, this is 'https://stream.wikimedia.org' - @param rcport: the port to connect to (default: 80) - @param rcpath: the sockets.io path. For Wikimedia wikis, this is '/rc'. - (default: '/rc') - @param total: the maximum number of entries to return. The underlying thread - is shut down then this number is reached. - - @return: yield dict as formatted by MediaWiki's - MachineReadableRCFeedFormatter, which consists of at least id - (recent changes id), type ('edit', 'new', 'log' or 'external'), - namespace, title, comment, timestamp, user and bot (bot flag for the - change). 
- @see: U{MachineReadableRCFeedFormatter<https://doc.wikimedia.org/ - mediawiki-core/master/php/classMachineReadableRCFeedFormatter.html>} - @rtype: generator - @raises ImportError - """ - if isinstance(socketIO_client, Exception): - raise ImportError('socketIO_client is required for the rc stream;\n' - 'install it with pip install "socketIO_client==0.5.6"') - - rc_thread = RcListenerThread( - wikihost=wikihost, - rchost=rchost, rcport=rcport, rcpath=rcpath, - total=total - ) - - debug('Starting rcstream thread %r' % rc_thread, - _logger) - rc_thread.start() - - while True: - try: - element = rc_thread.queue.get(timeout=0.1) - except Empty: - continue - if element is None: - return - yield element - - -@deprecated('eventstreams.site_rc_listener') -def site_rc_listener(site, total=None): - """Yield changes received from RCstream. - - @param site: the Pywikibot.Site object to yield live recent changes for - @type site: Pywikibot.BaseSite - @param total: the maximum number of changes to return - @type total: int - - @return: pywikibot.comms.rcstream.rc_listener configured for the given site - """ - return rc_listener( - wikihost=site.hostname(), - rchost=site.rcstream_host(), - rcport=site.rcstream_port(), - rcpath=site.rcstream_path(), - total=total, - ) diff --git a/pywikibot/family.py b/pywikibot/family.py index 1ff3d7f..d5b8d6f 100644 --- a/pywikibot/family.py +++ b/pywikibot/family.py @@ -1144,12 +1144,19 @@
     def rcstream_host(self, code):
         """Hostname for RCStream."""
-        raise NotImplementedError(
-            'This family does support neither RCStream nor EventStreams')
+        raise NotImplementedError('This family does not support RCStream')
 
     def rcstream_path(self, code):
         """Return path for RCStream."""
         raise NotImplementedError("This family does not support RCStream")
+
+    def rcstream_port(self, code):
+        """Return port for RCStream."""
+        raise NotImplementedError('This family does not support RCStream')
+
+    def eventstreams_host(self, code):
+        """Hostname for EventStreams."""
+        raise NotImplementedError('This family does not support EventStreams')
 
     def eventstreams_path(self, code):
         """Return path for EventStreams."""
@@ -1639,17 +1646,14 @@
         """Return 'https' as the protocol."""
         return 'https'
 
+    @deprecated('eventstreams_host')
     def rcstream_host(self, code):
-        """Return 'https://stream.wikimedia.org' as the RCStream hostname."""
+        """DEPRECATED: use eventstreams_host instead."""
+        return self.eventstreams_host(code)
+
+    def eventstreams_host(self, code):
+        """Return 'https://stream.wikimedia.org' as the stream hostname."""
         return 'https://stream.wikimedia.org'
-
-    def rcstream_port(self, code):
-        """Return 443 as the RCStream port number."""
-        return 443
-
-    def rcstream_path(self, code):
-        """Return path for RCStream."""
-        return '/rc'
def eventstreams_path(self, code): """Return path for EventStreams.""" diff --git a/pywikibot/pagegenerators.py b/pywikibot/pagegenerators.py index 32c1c22..f125ab8 100644 --- a/pywikibot/pagegenerators.py +++ b/pywikibot/pagegenerators.py @@ -2403,10 +2403,12 @@ """ Yield pages from a socket.io RC stream.
-    Generates pages based on the socket.io recent changes stream.
+    Generates pages based on the EventStreams Server-Sent-Event (SSE) recent
+    changes stream.
     The Page objects will have an extra property ._rcinfo containing the
     literal rc data. This can be used to e.g. filter only new pages. See
-    `pywikibot.comms.rcstream.rc_listener` for details on the .rcinfo format.
+    `pywikibot.comms.eventstreams.rc_listener` for details on the .rcinfo
+    format.
@param site: site to return recent changes for @type site: pywikibot.BaseSite diff --git a/setup.py b/setup.py index cb74bca..1657ad5 100644 --- a/setup.py +++ b/setup.py @@ -61,8 +61,6 @@ 'IRC': [irc_dep], 'mwparserfromhell': ['mwparserfromhell>=0.3.3'], 'Tkinter': ['Pillow<3.5.0' if PY26 else 'Pillow'], - # 0.6.1 supports socket.io 1.0, but WMF is using 0.9 (T91393 and T85716) - 'rcstream': ['socketIO-client<0.6.1'], 'security': ['requests[security]', 'pycparser!=2.14'], 'mwoauth': ['mwoauth>=0.2.4,!=0.3.1'], 'html': ['BeautifulSoup4'], diff --git a/tox.ini b/tox.ini index 724aa80..0b652a8 100644 --- a/tox.ini +++ b/tox.ini @@ -12,7 +12,7 @@ envlist = flake8,pyflakes-{py3,pypy}
 [params]
-doctest_skip = --ignore-files=(gui.py|botirc.py|rcstream.py)
+doctest_skip = --ignore-files=(gui.py|botirc.py|eventstreams.py)
 
 [testenv]
 setenv =