jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/326923 )
Change subject: site.py: Adding new warnings 'nochange' and 'duplicateversions'
......................................................................
site.py: Adding new warnings 'nochange' and 'duplicateversions'
These warnings appear only when the uploaded file is a duplicate of the
current version or of older versions of the file under the same
filename.
The messages are from MediaWiki i18n/en.json
Bug: T153060
Change-Id: I6008bf855d4cbddabcdbd30d9cc9cafd7e87f7c5
---
M pywikibot/site.py
1 file changed, 6 insertions(+), 0 deletions(-)
Approvals:
Magul: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/site.py b/pywikibot/site.py
index 5db67aa..92245fd 100644
--- a/pywikibot/site.py
+++ b/pywikibot/site.py
@@ -5930,6 +5930,12 @@
'"%(msg)s".',
'bad-prefix': 'Target filename has a bad prefix %(msg)s.',
'page-exists': 'Target filename exists but with a different file %(msg)s.',
+
+ # API-returned message string will be timestamps, not much use here
+ 'nochange': 'The upload is an exact duplicate of the current version of '
+ 'this file.',
+ 'duplicateversions': 'The upload is an exact duplicate of older '
+ 'version(s) of this file.',
}
# An offset != 0 doesn't make sense without a file key
--
To view, visit https://gerrit.wikimedia.org/r/326923
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I6008bf855d4cbddabcdbd30d9cc9cafd7e87f7c5
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Zhuyifei1999 <zhuyifei1999(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>
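
For illustration, the warning table touched by the change above could be consumed roughly as follows. This is a minimal sketch, not pywikibot's actual upload code: the names UPLOAD_WARNINGS and format_upload_warning and the fallback text are hypothetical; only the message strings are taken from the diff.

```python
# Hypothetical sketch: only the message texts come from the diff above.
UPLOAD_WARNINGS = {
    'page-exists': 'Target filename exists but with a different file %(msg)s.',
    # The API-returned string for these two codes is a timestamp,
    # so the templates deliberately ignore %(msg)s.
    'nochange': 'The upload is an exact duplicate of the current version of '
                'this file.',
    'duplicateversions': 'The upload is an exact duplicate of older '
                         'version(s) of this file.',
}


def format_upload_warning(code, msg=''):
    """Return a readable message for an upload warning code."""
    template = UPLOAD_WARNINGS.get(code)
    if template is None:
        # Assumed fallback for codes that have no template.
        return 'Unknown upload warning %s: %s' % (code, msg)
    # %-formatting with a dict tolerates templates that do not use 'msg'.
    return template % {'msg': msg}
```

Because the templates are formatted with a mapping, entries that omit %(msg)s simply drop the timestamp instead of raising an error.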
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/270577 )
Change subject: [IMPR] Provide a private user script path to be used by pwb.py
......................................................................
[IMPR] Provide a private user script path to be used by pwb.py
- A private script path must be located inside the framework folder.
The user_script_paths will be searched first.
- Print a UserWarning when user_script_paths is not a list or a tuple
- generate_user_files.py may copy this setting part to user-config
Change-Id: I3191a91b32c15cb69b77e3d0c9ccfd402cc3f948
---
M generate_user_files.py
M pwb.py
M pywikibot/config2.py
3 files changed, 28 insertions(+), 3 deletions(-)
Approvals:
Merlijn van Deen: Looks good to me, approved
jenkins-bot: Verified
diff --git a/generate_user_files.py b/generate_user_files.py
index 8e59c8b..3343985 100755
--- a/generate_user_files.py
+++ b/generate_user_files.py
@@ -2,7 +2,7 @@
# -*- coding: utf-8 -*-
"""Script to create user-config.py."""
#
-# (C) Pywikibot team, 2010-2015
+# (C) Pywikibot team, 2010-2016
#
# Distributed under the terms of the MIT license.
#
@@ -255,6 +255,7 @@
res = re.findall("^(# ############# (?:"
"LOGFILE|"
+ 'EXTERNAL SCRIPT PATH|'
"INTERWIKI|"
"SOLVE_DISAMBIGUATION|"
"IMAGE RELATED|"
diff --git a/pwb.py b/pwb.py
index 96de789..dc015b7 100755
--- a/pwb.py
+++ b/pwb.py
@@ -9,7 +9,7 @@
and it will use the package directory to store all user files, will fix up
search paths so the package does not need to be installed, etc.
"""
-# (C) Pywikibot team, 2015
+# (C) Pywikibot team, 2015-2016
#
# Distributed under the terms of the MIT license.
#
@@ -212,6 +212,14 @@
script_paths = ['scripts',
'scripts.maintenance',
'scripts.archive']
+ from pywikibot import config # flake8: disable=E402
+ if config.user_script_paths:
+ if isinstance(config.user_script_paths, (tuple, list)):
+ script_paths = config.user_script_paths + script_paths
+ else:
+ warn("'user_script_paths' must be a list or tuple,\n"
+ 'found: {0}. Ignoring this setting.'
+ ''.format(type(config.user_script_paths)))
for file_package in script_paths:
paths = file_package.split('.') + [filename]
testpath = os.path.join(_pwb_dir, *paths)
diff --git a/pywikibot/config2.py b/pywikibot/config2.py
index a1aa14f..9f4d72e 100644
--- a/pywikibot/config2.py
+++ b/pywikibot/config2.py
@@ -33,7 +33,7 @@
"""
#
# (C) Rob W.W. Hooft, 2003
-# (C) Pywikibot team, 2003-2015
+# (C) Pywikibot team, 2003-2016
#
# Distributed under the terms of the MIT license.
#
@@ -516,6 +516,22 @@
# (overrides log setting above)
debug_log = []
+# ############# EXTERNAL SCRIPT PATH SETTING ##############
+# set your own script path to lookup for your script files.
+# your private script path must be located inside the
+# framework folder, subfolders must be delimited by '.'.
+# every folder must contain an (empty) __init__.py file.
+#
+# The search order is
+# 1. user_script_paths in the given order
+# 2. scripts
+# 3. scripts/maintenance
+# 4. scripts/archive
+#
+# sample:
+# user_script_paths = ['scripts.myscripts']
+user_script_paths = []
+
# ############# INTERWIKI SETTINGS ##############
# Should interwiki.py report warnings for missing links between foreign
--
To view, visit https://gerrit.wikimedia.org/r/270577
Gerrit-MessageType: merged
Gerrit-Change-Id: I3191a91b32c15cb69b77e3d0c9ccfd402cc3f948
Gerrit-PatchSet: 6
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <Ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
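
The search order described in the change above can be sketched in isolation as below. This is an illustration under assumptions, not the code in pwb.py: the function name build_script_paths and the constant DEFAULT_SCRIPT_PATHS are hypothetical. The sketch also converts the user value to a list before concatenating, because adding a tuple to a list raises a TypeError in Python.

```python
import warnings

# Default packages searched by pwb.py, as listed in the diff above.
DEFAULT_SCRIPT_PATHS = ['scripts', 'scripts.maintenance', 'scripts.archive']


def build_script_paths(user_script_paths):
    """Return the package search order with user paths first (hypothetical)."""
    if not user_script_paths:
        return list(DEFAULT_SCRIPT_PATHS)
    if isinstance(user_script_paths, (tuple, list)):
        # list() lets a tuple be concatenated with the default list.
        return list(user_script_paths) + DEFAULT_SCRIPT_PATHS
    warnings.warn("'user_script_paths' must be a list or tuple,\n"
                  'found: {0}. Ignoring this setting.'
                  ''.format(type(user_script_paths)))
    return list(DEFAULT_SCRIPT_PATHS)
```

With user_script_paths = ['scripts.myscripts'], a script is looked up in scripts/myscripts before the three default folders.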
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/326313 )
Change subject: site_detect.py: Raise ServerError on 500 response
......................................................................
site_detect.py: Raise ServerError on 500 response
A test for wikichristian.org is occasionally returning 500 error codes and
making travis builds fail.
Raising a ServerError here will make that test skip.
Bug: T151368
Change-Id: I9e0c5d64c28f8d4baf860541ad518c4ae861177d
---
M pywikibot/site_detect.py
1 file changed, 2 insertions(+), 0 deletions(-)
Approvals:
Magul: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/site_detect.py b/pywikibot/site_detect.py
index cc69e4b..4eb78d7 100644
--- a/pywikibot/site_detect.py
+++ b/pywikibot/site_detect.py
@@ -54,6 +54,8 @@
r = fetch(fromurl)
if r.status == 503:
raise ServerError('Service Unavailable')
+ elif r.status == 500:
+ raise ServerError('Internal Server Error')
if fromurl != r.data.url:
pywikibot.log('{0} redirected to {1}'.format(fromurl, r.data.url))
--
To view, visit https://gerrit.wikimedia.org/r/326313
Gerrit-MessageType: merged
Gerrit-Change-Id: I9e0c5d64c28f8d4baf860541ad518c4ae861177d
Gerrit-PatchSet: 2
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>
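
The status check added in the change above amounts to the pattern below. This is a self-contained sketch: the ServerError class here is a local stand-in for pywikibot's exception, and check_status is a hypothetical helper that does not exist in site_detect.py.

```python
class ServerError(Exception):
    """Local stand-in for pywikibot's ServerError (assumption)."""


def check_status(status):
    """Raise ServerError for the HTTP status codes handled in the diff."""
    if status == 503:
        raise ServerError('Service Unavailable')
    elif status == 500:
        raise ServerError('Internal Server Error')
    return status
```

Raising a dedicated exception, rather than failing later on the response body, is what lets the test harness skip the affected test, as the commit message states.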
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/324150 )
Change subject: Remove compat stuff "via API" from user messages
......................................................................
Remove compat stuff "via API" from user messages
- all pages are retrieved via API in core. It is not necessary to show that.
This was a relic from compat which had a screen scraping variant.
Change-Id: I63e236390ac83221fe6e1b61bcd67ab9bdb8caf1
---
M scripts/redirect.py
M scripts/watchlist.py
2 files changed, 4 insertions(+), 4 deletions(-)
Approvals:
jenkins-bot: Verified
Sn1per: Looks good to me, approved
diff --git a/scripts/redirect.py b/scripts/redirect.py
index abe9b34..2c9fadb 100755
--- a/scripts/redirect.py
+++ b/scripts/redirect.py
@@ -342,9 +342,9 @@
datetime.timedelta(0, self.offset * 3600))
# self.offset hours ago
offset_time = start.strftime("%Y%m%d%H%M%S")
- pywikibot.output(u'Retrieving %s moved pages via API...'
- % (str(self.api_number)
- if self.api_number is not None else "all"))
+ pywikibot.output('Retrieving {0} moved pages...'
+ ''.format(str(self.api_number)
+ if self.api_number is not None else 'all'))
move_gen = self.site.logevents(logtype="move", start=offset_time)
if self.api_number:
move_gen.set_maximum_items(self.api_number)
diff --git a/scripts/watchlist.py b/scripts/watchlist.py
index 316ed70..931a6be 100755
--- a/scripts/watchlist.py
+++ b/scripts/watchlist.py
@@ -53,7 +53,7 @@
def refresh(site, sysop=False):
"""Fetch the watchlist."""
- pywikibot.output(u'Retrieving watchlist for %s via API.' % str(site))
+ pywikibot.output('Retrieving watchlist for {0}.'.format(str(site)))
return list(site.watched_pages(sysop=sysop, force=True))
--
To view, visit https://gerrit.wikimedia.org/r/324150
Gerrit-MessageType: merged
Gerrit-Change-Id: I63e236390ac83221fe6e1b61bcd67ab9bdb8caf1
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Sn1per <geofbot(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/326245 )
Change subject: Document test decorators.
......................................................................
Document test decorators.
Documentation regarding mock.patch, tests.aspects.require_modules,
unittest.skipIf and unittest.skipUnless is added to the README file.
Bug: T152068
Change-Id: I335323a1704a9912dbc39bcec8042ae8ee706234
---
M tests/README.rst
1 file changed, 62 insertions(+), 1 deletion(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
diff --git a/tests/README.rst b/tests/README.rst
index fff848f..90f88db 100644
--- a/tests/README.rst
+++ b/tests/README.rst
@@ -226,6 +226,68 @@
Enabling only 'edit failure' tests or 'write' tests won't enable the other tests
automatically.
+Decorators
+=====================
+
+pywikibot's test suite, including Python's unittest module, provides decorators
+to modify the behaviour of the test cases.
+
+@unittest.skipIf
+-----------------
+Skip a test if the condition is true. Refer to unittest's documentation.
+
+::
+
+ import unittest
+ [......]
+ @unittest.skipIf(check_if_fatal(), 'Something is not okay.')
+ def test_skipIf(self):
+
+@unittest.skipUnless
+---------------------
+Skip a test unless the condition is true. Refer to unittest's documentation.
+
+::
+
+ import unittest
+ [......]
+ @unittest.skipUnless(check_if_true(), 'Something must happen.')
+ def test_skipUnless(self):
+
+@tests.aspects.require_modules
+-------------------------------
+Require that the given list of modules can be imported.
+
+::
+
+ from tests.aspects import require_modules
+ [......]
+ @require_modules(['important1', 'musthave2'])
+ def test_require_modules(self):
+
+@(unittest.)mock.patch
+-----------------------
+Replaces `target` with object specified in `new`. Refer to mock's documentation.
+This is especially useful in tests, where requests to third-parties should be
+avoided.
+
+In Python 3, this is part of the built-in unittest module.
+
+::
+
+ if sys.version_info[0] > 2:
+ from unittest.mock import patch
+ else:
+ from mock import patch
+
+
+ def fake_ping(url):
+ return 'pong'
+ [......]
+ @patch('http_ping', side_effect=fake_ping)
+ def test_patch(self):
+ self.assertEqual('pong', http_ping())
+
Contributing tests
==================
@@ -293,4 +355,3 @@
- ``user = True`` : test class needs to login to site
- ``sysop = True`` : test class needs to login to site as a sysop
- ``write = True`` : test class needs to write to a site
-
--
To view, visit https://gerrit.wikimedia.org/r/326245
Gerrit-MessageType: merged
Gerrit-Change-Id: I335323a1704a9912dbc39bcec8042ae8ee706234
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dargasia <hxiao+scm(a)dargasea.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>
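
The decorators documented in the change above can be tried out with the runnable sketch below. It is not taken from the pywikibot test suite: the Pinger class and the test names are hypothetical, and Python 3 is assumed. Since mock.patch generally expects a dotted import path as its target, the sketch uses mock.patch.object on a local class instead of the bare name shown in the README snippet.

```python
import unittest
from unittest import mock


class Pinger(object):
    """Hypothetical class whose method would normally hit the network."""

    def http_ping(self, url):
        raise RuntimeError('network access is not allowed in this sketch')


def fake_ping(url):
    """Replacement used as side_effect, so no request is made."""
    return 'pong'


class TestDecorators(unittest.TestCase):
    """Demonstrate mock.patch.object and unittest.skipIf."""

    @mock.patch.object(Pinger, 'http_ping', side_effect=fake_ping)
    def test_patch(self, mocked_ping):
        # The mock replaces the method on the class; it receives only the
        # explicit argument because a plain mock is not bound to the instance.
        self.assertEqual('pong', Pinger().http_ping('http://example.org'))
        mocked_ping.assert_called_once_with('http://example.org')

    @unittest.skipIf(True, 'Demonstrates an unconditional skip.')
    def test_skipped(self):
        self.fail('never runs')
```

The patch decorator passes the created mock as an extra argument to the test method, which is why test_patch accepts mocked_ping.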
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/323518 )
Change subject: login.py: Support specifying '*' for 'family' in user-config.py
......................................................................
login.py: Support specifying '*' for 'family' in user-config.py
- __init__.py:
- Check the username and sysopname of family '*' if none is found
in the specified family.
- login_tests.py:
- Use "import unittest.mock" instead of "import mock" when available.
(mock has been part of the standard library since python 3.3)
- FakeConfig.usernames should be a defaultdict.
- Add a test for the new functionality of LoginManager.
Bug: T120334
Change-Id: I23749a4035c7d27186a92e67c6d6206e10326ff0
---
M pywikibot/__init__.py
M pywikibot/config2.py
M pywikibot/login.py
M tests/login_tests.py
4 files changed, 49 insertions(+), 17 deletions(-)
Approvals:
jenkins-bot: Verified
Sn1per: Looks good to me, approved
diff --git a/pywikibot/__init__.py b/pywikibot/__init__.py
index 42b93ec..c68fcfe 100644
--- a/pywikibot/__init__.py
+++ b/pywikibot/__init__.py
@@ -825,13 +825,16 @@
interface = interface or fam.interface(code)
- # config.usernames is initialised with a dict for each family name
+ # config.usernames is initialised with a defaultdict for each family name
family_name = str(fam)
- if family_name in config.usernames:
- user = user or config.usernames[family_name].get(code) \
- or config.usernames[family_name].get('*')
- sysop = sysop or config.sysopnames[family_name].get(code) \
- or config.sysopnames[family_name].get('*')
+
+ code_to_user = config.usernames['*'].copy()
+ code_to_user.update(config.usernames[family_name])
+ user = user or code_to_user.get(code) or code_to_user.get('*')
+
+ code_to_sysop = config.sysopnames['*'].copy()
+ code_to_sysop.update(config.sysopnames[family_name])
+ sysop = sysop or code_to_sysop.get(code) or code_to_sysop.get('*')
if not isinstance(interface, type):
# If it isnt a class, assume it is a string
diff --git a/pywikibot/config2.py b/pywikibot/config2.py
index 9451eb5..e1ab7dc 100644
--- a/pywikibot/config2.py
+++ b/pywikibot/config2.py
@@ -116,6 +116,7 @@
# If you have a unique username for all languages of a family,
# you can use '*'
# usernames['wikibooks']['*'] = 'mySingleUsername'
+# You may use '*' for family name in a similar manner.
#
# If you have a sysop account on some wikis, this will be used to delete pages
# or to edit locked pages if you add such lines to your
diff --git a/pywikibot/login.py b/pywikibot/login.py
index e110c47..01ebf6b 100644
--- a/pywikibot/login.py
+++ b/pywikibot/login.py
@@ -91,9 +91,12 @@
if user:
self.username = user
elif sysop:
+ config_names = config.sysopnames
+ family_sysopnames = (
+ config_names[self.site.family.name] or config_names['*']
+ )
+ self.username = family_sysopnames.get(self.site.code, None)
try:
- family_sysopnames = config.sysopnames[self.site.family.name]
- self.username = family_sysopnames.get(self.site.code, None)
self.username = self.username or family_sysopnames['*']
except KeyError:
raise NoUsername(u"""\
@@ -104,11 +107,14 @@
% {'fam_name': self.site.family.name,
'wiki_code': self.site.code})
else:
+ config_names = config.usernames
+ family_usernames = (
+ config_names[self.site.family.name] or config_names['*']
+ )
+ self.username = family_usernames.get(self.site.code, None)
try:
- family_usernames = config.usernames[self.site.family.name]
- self.username = family_usernames.get(self.site.code, None)
self.username = self.username or family_usernames['*']
- except:
+ except KeyError:
raise NoUsername(u"""\
ERROR: Username for %(fam_name)s:%(wiki_code)s is undefined.
If you have an account for that site, please add a line to user-config.py:
diff --git a/tests/login_tests.py b/tests/login_tests.py
index 9c63bc3..b40ea7c 100644
--- a/tests/login_tests.py
+++ b/tests/login_tests.py
@@ -13,8 +13,13 @@
__version__ = '$Id$'
#
-import mock
+from collections import defaultdict
+try:
+ import unittest.mock as mock
+except ImportError:
+ import mock
+from pywikibot.exceptions import NoUsername
from pywikibot.login import LoginManager
from tests.aspects import (
@@ -35,17 +40,15 @@
code = "~FakeCode"
family = FakeFamily
+
FakeUsername = "~FakeUsername"
class FakeConfig(object):
"""Mock."""
- usernames = {
- FakeFamily.name: {
- FakeSite.code: FakeUsername
- }
- }
+ usernames = defaultdict(dict)
+ usernames[FakeFamily.name] = {FakeSite.code: FakeUsername}
@mock.patch("pywikibot.Site", FakeSite)
@@ -62,6 +65,24 @@
self.assertEqual(obj.username, FakeUsername)
self.assertEqual(obj.login_name, FakeUsername)
self.assertIsNone(obj.password)
+
+ @mock.patch.dict(
+ FakeConfig.usernames,
+ {'*': {'*': FakeUsername}},
+ clear=True
+ )
+ def test_star_family(self):
+ """Test LoginManager with '*' as family."""
+ lm = LoginManager()
+ self.assertEqual(lm.username, FakeUsername)
+
+ del FakeConfig.usernames['*']
+ FakeConfig.usernames['*']['en'] = FakeUsername
+ self.assertRaises(NoUsername, LoginManager)
+
+ FakeConfig.usernames['*']['*'] = FakeUsername
+ lm = LoginManager()
+ self.assertEqual(lm.username, FakeUsername)
@mock.patch("pywikibot.Site", FakeSite)
@@ -166,5 +187,6 @@
""", '~FakePassword')
self.assertEqual(obj.login_name, "~FakeUsername@~FakeSuffix")
+
if __name__ == '__main__': # pragma: no cover
unittest.main()
--
To view, visit https://gerrit.wikimedia.org/r/323518
Gerrit-MessageType: merged
Gerrit-Change-Id: I23749a4035c7d27186a92e67c6d6206e10326ff0
Gerrit-PatchSet: 15
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: Hashar <hashar(a)free.fr>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Sn1per <geofbot(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
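
The lookup introduced in pywikibot/__init__.py by the change above can be exercised on its own as follows. This is a sketch under assumptions: resolve_username is a hypothetical wrapper around the merge logic from the diff, and the account names are made up.

```python
from collections import defaultdict


def resolve_username(usernames, family_name, code):
    """Merge the '*' family with the specific family, then look up the code."""
    code_to_user = usernames['*'].copy()
    # Entries of the specific family override those of the '*' family.
    code_to_user.update(usernames[family_name])
    return code_to_user.get(code) or code_to_user.get('*')


# Mirrors the shape of config.usernames: a defaultdict of dicts.
usernames = defaultdict(dict)
usernames['*']['*'] = 'GlobalBot'
usernames['wikibooks']['*'] = 'BooksBot'
usernames['wikipedia']['de'] = 'DeBot'
```

A site code defined in the specific family wins, then that family's '*' entry, and only then the '*' entry of the '*' family.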
jenkins-bot has submitted this change and it was merged.
Change subject: [bugfix] Fix format string
......................................................................
[bugfix] Fix format string
- match.groups() returns a tuple. Passing it to a %-format string raises a
TypeError when the tuple does not contain exactly one element.
- Use exception message directly and use str.format() method.
Bug: T152499
Change-Id: I3eef158317882f239ef3bc4bd499b31fd5bb25ce
---
M pywikibot/textlib.py
1 file changed, 3 insertions(+), 3 deletions(-)
Approvals:
John Vandenberg: Looks good to me, approved
Dalba: Looks good to me, but someone else must approve
jenkins-bot: Verified
diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index 48ca786..9f7782e 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -403,9 +403,9 @@
replacement += new[last:group_match.start()]
replacement += match.group(group_id) or ''
except IndexError:
- pywikibot.output('\nInvalid group reference: %s' % group_id)
- pywikibot.output('Groups found:\n%s' % match.groups())
- raise IndexError
+ raise IndexError(
+ 'Invalid group reference: {0}\nGroups found: {1}'
+ ''.format(group_id, match.groups()))
last = group_match.end()
replacement += new[last:]
--
To view, visit https://gerrit.wikimedia.org/r/325556
Gerrit-MessageType: merged
Gerrit-Change-Id: I3eef158317882f239ef3bc4bd499b31fd5bb25ce
Gerrit-PatchSet: 2
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
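
The failure mode fixed by the change above is easy to reproduce outside pywikibot. The sketch below uses a made-up pattern and input string; only the two formatting styles correspond to the diff.

```python
import re

# A pattern with two groups, so match.groups() has more than one element.
match = re.match(r'(\w+) (\w+)', 'hello world')
groups = match.groups()

try:
    # %-formatting unpacks the tuple and finds only one placeholder.
    'Groups found:\n%s' % groups
except TypeError as error:
    old_style_error = str(error)
else:
    old_style_error = None

# str.format() treats the tuple as a single argument.
new_style = 'Groups found: {0}'.format(groups)
```

With str.format() the tuple is rendered as a whole, regardless of how many groups the pattern defines.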
jenkins-bot has submitted this change and it was merged.
Change subject: Improve fake user agent usage control
......................................................................
Improve fake user agent usage control
comms.http.get_fake_user_agent() is renamed to fake_user_agent() to match the style of user_agent(). Logic checking config variable fake_user_agent is removed, as this function should not be responsible for deciding whether a fake UA should be used. Test cases testing the config-checking logic are removed.
The use_fake_user_agent argument is added to comms.http.fetch(), which will specify if fake UAs should be used when the method is called to make HTTP requests. Test cases testing this logic are added.
The fake_user_agent config variable is deprecated. fake_user_agent_default is introduced to set per-script behaviour. fake_user_agent_exceptions is introduced to set per-domain behaviours (will be checked by fetch()).
Bug: T152075
Change-Id: I28594fd1b5ccb6ed3e885db5600bb0464dccfa0e
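
The decision order described above (an explicit header first, then the per-domain exception, then the use_fake_user_agent argument) can be sketched as a pure function. This is an illustration, not the code in comms/http.py: choose_user_agent and its default_ua and fake_ua parameters are hypothetical placeholders for pywikibot's own UA and a randomly generated one, and Python 3 is assumed.

```python
from urllib.parse import urlparse


def choose_user_agent(uri, use_fake_user_agent, exceptions, headers=None,
                      default_ua='pywikibot-default-ua',
                      fake_ua='fake-browser-ua'):
    """Return the user agent that would be sent for uri (hypothetical)."""
    headers = headers or {}
    if headers.get('user-agent'):
        # A user agent given in the request headers is left untouched.
        return headers['user-agent']
    # A per-domain setting overrides the caller's argument.
    domain = urlparse(uri).netloc
    setting = exceptions.get(domain, use_fake_user_agent)
    if setting and isinstance(setting, str):
        return setting  # custom UA string
    if setting is True:
        return fake_ua
    return default_ua
```

Keeping the per-domain table separate from the per-script default lets a single site be exempted without changing the behaviour of a whole script.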
---
M pywikibot/comms/http.py
M pywikibot/config2.py
M scripts/reflinks.py
M scripts/weblinkchecker.py
M tests/http_tests.py
5 files changed, 186 insertions(+), 61 deletions(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/comms/http.py b/pywikibot/comms/http.py
index 908f3f1..7bf6235 100644
--- a/pywikibot/comms/http.py
+++ b/pywikibot/comms/http.py
@@ -38,10 +38,11 @@
if sys.version_info[0] > 2:
from http import cookiejar as cookielib
- from urllib.parse import quote
+ from urllib.parse import quote, urlparse
else:
import cookielib
from urllib2 import quote
+ from urlparse import urlparse
from pywikibot import config
@@ -53,6 +54,7 @@
)
from pywikibot.logging import critical, debug, error, log, warning
from pywikibot.tools import (
+ deprecated,
deprecate_arg,
file_mode_checker,
issue_deprecation_warning,
@@ -234,31 +236,43 @@
return formatted
+@deprecated('pywikibot.comms.http.fake_user_agent')
def get_fake_user_agent():
"""
- Return a user agent to be used when faking a web browser.
+ Return a fake user agent depending on `fake_user_agent` option in config.
+
+ Deprecated, use fake_user_agent() instead.
@rtype: str
"""
- # Check fake_user_agent configuration variable
if isinstance(config.fake_user_agent, StringTypes):
- return pywikibot.config2.fake_user_agent
+ return config.fake_user_agent
+ elif config.fake_user_agent or config.fake_user_agent is None:
+ return fake_user_agent()
+ else:
+ return user_agent()
- if config.fake_user_agent is None or config.fake_user_agent is True:
- try:
- import browseragents
- return browseragents.core.random()
- except ImportError:
- pass
- try:
- import fake_useragent
- return fake_useragent.fake.UserAgent().random
- except ImportError:
- pass
+def fake_user_agent():
+ """
+ Return a fake user agent.
- # Use the default real user agent
- return user_agent()
+ @rtype: str
+ """
+ try:
+ import browseragents
+ return browseragents.core.random()
+ except ImportError:
+ pass
+
+ try:
+ import fake_useragent
+ return fake_useragent.fake.UserAgent().random
+ except ImportError:
+ pass
+
+ raise ImportError( # Actually complain when neither is installed.
+ 'Either browseragents or fake_useragent must be installed to get fake UAs.')
@deprecate_arg('ssl', None)
@@ -443,7 +457,7 @@
def fetch(uri, method="GET", body=None, headers=None,
- default_error_handling=True, **kwargs):
+ default_error_handling=True, use_fake_user_agent=False, **kwargs):
"""
Blocking HTTP request.
@@ -454,8 +468,27 @@
@param default_error_handling: Use default error handling
@type default_error_handling: bool
+ @type use_fake_user_agent: bool, str
+ @param use_fake_user_agent: Set to True to use fake UA, False to use
+ pywikibot's UA, str to specify own UA. This behaviour might be
+ overridden by domain in config.
@rtype: L{threadedhttp.HttpRequest}
"""
+ # Change user agent depending on fake UA settings.
+ # Set header to new UA if needed.
+ headers = headers or {}
+ if not headers.get('user-agent', None): # Skip if already specified in request.
+ # Get fake UA exceptions from `fake_user_agent_exceptions` config.
+ uri_domain = urlparse(uri).netloc
+ use_fake_user_agent = config.fake_user_agent_exceptions.get(
+ uri_domain, use_fake_user_agent)
+
+ if use_fake_user_agent and isinstance(
+ use_fake_user_agent, StringTypes): # Custom UA.
+ headers['user-agent'] = use_fake_user_agent
+ elif use_fake_user_agent is True:
+ headers['user-agent'] = fake_user_agent()
+
request = _enqueue(uri, method, body, headers, **kwargs)
assert(request._data is not None) # if there's no data in the answer we're in trouble
# Run the error handling callback in the callers thread so exceptions
diff --git a/pywikibot/config2.py b/pywikibot/config2.py
index 9451eb5..a98aeb7 100644
--- a/pywikibot/config2.py
+++ b/pywikibot/config2.py
@@ -93,7 +93,7 @@
_private_values = ['authenticate', 'proxy', 'db_password']
_deprecated_variables = ['use_SSL_onlogin', 'use_SSL_always',
- 'available_ssl_project']
+ 'available_ssl_project', 'fake_user_agent']
# ############# ACCOUNT SETTINGS ##############
@@ -137,16 +137,22 @@
user_agent_format = ('{script_product} ({script_comments}) {pwb} ({revision}) '
'{http_backend} {python}')
-# Fake user agent
-# Used to retrieve pages in reflinks.py,
-# to work around user-agent sniffing webpages
-# When None or True,
-# Use random user agent if either browseragents or fake_useragent
-# packages are installed
-# Otherwise use pywikibot.comms.http.user_agent()
-# When set to False,
-# disables use of automatic user agents
-fake_user_agent = None
+# Fake user agent.
+# Some external websites reject bot-like user agents. It is possible to use
+# fake user agents in requests to these websites.
+# It is recommended to default this to False and use on an as-needed basis.
+#
+# Default behaviours in modules that can utilize fake UAs.
+# True for enabling fake UA, False for disabling / using pywikibot's own UA, str
+# to specify custom UA.
+fake_user_agent_default = {'reflinks': False, 'weblinkchecker': False}
+# Website domains excepted to the default behaviour.
+# True for enabling, False for disabling, str to hardcode a UA.
+# Example: {'problematic.site.example': True,
+# 'prefers.specific.ua.example': 'snakeoil/4.2'}
+fake_user_agent_exceptions = {}
+# This following option is deprecated in favour of finer control options above.
+fake_user_agent = False
# The default interface for communicating with the site
# currently the only defined interface is 'APISite', so don't change this!
diff --git a/scripts/reflinks.py b/scripts/reflinks.py
index c5daf1d..1e930a4 100755
--- a/scripts/reflinks.py
+++ b/scripts/reflinks.py
@@ -59,6 +59,7 @@
import pywikibot
from pywikibot import comms, i18n, pagegenerators, textlib, Bot
+from pywikibot import config2 as config
from pywikibot.pagegenerators import (
XMLDumpPageGenerator as _XMLDumpPageGenerator,
)
@@ -395,8 +396,7 @@
super(ReferencesRobot, self).__init__(**kwargs)
self.generator = generator
self.site = pywikibot.Site()
- self._user_agent = comms.http.get_fake_user_agent()
- pywikibot.log('Using fake user agent: {0}'.format(self._user_agent))
+ self._use_fake_user_agent = config.fake_user_agent_default.get('reflinks', False)
# Check
manual = 'mw:Manual:Pywikibot/refLinks'
code = None
@@ -494,7 +494,6 @@
raise
editedpages = 0
- headers = {'user-agent': self._user_agent}
for page in self.generator:
try:
# Load the page's text from the wiki
@@ -526,10 +525,11 @@
f = None
try:
- f = requests.get(ref.url, headers=headers, timeout=60)
+ f = comms.http.fetch(
+ ref.url, use_fake_user_agent=self._use_fake_user_agent)
# Try to get Content-Type from server
- contentType = f.headers.get('content-type')
+ contentType = f.response_headers.get('content-type')
if contentType and not self.MIME.search(contentType):
if ref.link.lower().endswith('.pdf') and \
not self.getOption('ignorepdf'):
@@ -556,7 +556,7 @@
continue
# Get the real url where we end (http redirects !)
- redir = f.url
+ redir = f.data.url
if redir != ref.link and \
domain.findall(redir) == domain.findall(link):
if soft404.search(redir) and \
@@ -572,15 +572,15 @@
u'Redirect to root : {0} ', ref.link))
continue
- if f.status_code != requests.codes.ok:
+ if f.status != requests.codes.ok:
pywikibot.output(u'HTTP error (%s) for %s on %s'
- % (f.status_code, ref.url,
+ % (f.status, ref.url,
page.title(asLink=True)),
toStdout=True)
# 410 Gone, indicates that the resource has been purposely
# removed
- if f.status_code == 410 or \
- (f.status_code == 404 and (u'\t%s\t' % ref.url in deadLinks)):
+ if f.status == 410 or \
+ (f.status == 404 and (u'\t%s\t' % ref.url in deadLinks)):
repl = ref.refDead()
new_text = new_text.replace(match.group(), repl)
continue
diff --git a/scripts/weblinkchecker.py b/scripts/weblinkchecker.py
index f81d8c3..b8cb323 100755
--- a/scripts/weblinkchecker.py
+++ b/scripts/weblinkchecker.py
@@ -279,6 +279,8 @@
Returns a (boolean, string) tuple saying if the page is online and including
a status reason.
+ Per-domain user-agent faking is not supported in this deprecated class.
+
Warning: Also returns false if your Internet connection isn't working
correctly! (This will give a Socket Error)
@@ -292,11 +294,19 @@
redirectChain is a list of redirects which were resolved by
resolveRedirect(). This is needed to detect redirect loops.
"""
- self._user_agent = comms.http.get_fake_user_agent()
self.url = url
self.serverEncoding = serverEncoding
+
+ fake_ua_config = config.fake_user_agent_default.get(
+ 'weblinkchecker', False)
+ if fake_ua_config and isinstance(fake_ua_config, str):
+ user_agent = fake_ua_config
+ elif fake_ua_config:
+ user_agent = comms.http.fake_user_agent()
+ else:
+ user_agent = comms.http.user_agent()
self.header = {
- 'User-agent': self._user_agent,
+ 'user-agent': user_agent,
'Accept': 'text/xml,application/xml,application/xhtml+xml,'
'text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5',
'Accept-Language': 'de-de,de;q=0.8,en-us;q=0.5,en;q=0.3',
@@ -542,10 +552,8 @@
threading.Thread.__init__(self)
self.page = page
self.url = url
- self._user_agent = comms.http.get_fake_user_agent()
self.history = history
self.header = {
- 'User-agent': self._user_agent,
'Accept': 'text/xml,application/xml,application/xhtml+xml,'
'text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5',
'Accept-Language': 'de-de,de;q=0.8,en-us;q=0.5,en;q=0.3',
@@ -557,6 +565,8 @@
self.setName((u'%s - %s' % (page.title(), url)).encode('utf-8',
'replace'))
self.HTTPignore = HTTPignore
+ self._use_fake_user_agent = config.fake_user_agent_default.get(
+ 'weblinkchecker', False)
self.day = day
def run(self):
@@ -564,8 +574,8 @@
ok = False
try:
header = self.header
- timeout = pywikibot.config.socket_timeout
- r = requests.get(self.url, headers=header, timeout=timeout)
+ r = comms.http.fetch(
+ self.url, headers=header, use_fake_user_agent=self._use_fake_user_agent)
except requests.exceptions.InvalidURL:
message = i18n.twtranslate(self.page.site,
'weblinkchecker-badurl_msg',
@@ -574,11 +584,11 @@
pywikibot.output('Exception while processing URL %s in page %s'
% (self.url, self.page.title()))
raise
- if (r.status_code == requests.codes.ok and
- str(r.status_code) not in self.HTTPignore):
+ if (r.status == requests.codes.ok and
+ str(r.status) not in self.HTTPignore):
ok = True
else:
- message = '{0} {1}'.format(r.status_code, r.reason)
+ message = '{0}'.format(r.status)
if ok:
if self.history.setLinkAlive(self.url):
pywikibot.output('*Link to %s in [[%s]] is back alive.'
diff --git a/tests/http_tests.py b/tests/http_tests.py
index 05df31b..06f7117 100644
--- a/tests/http_tests.py
+++ b/tests/http_tests.py
@@ -285,7 +285,7 @@
self.assertIn('Python/' + str(PYTHON_VERSION[0]), http.user_agent())
-class FakeUserAgentTestCase(TestCase):
+class DryFakeUserAgentTestCase(TestCase):
"""Test the generation of fake user agents.
@@ -296,15 +296,96 @@
net = False
+ def _test_fake_user_agent_randomness(self):
+ """Test if user agent returns are randomized."""
+ self.assertNotEqual(http.fake_user_agent(), http.fake_user_agent())
+
+ @require_modules('browseragents')
+ def test_with_browseragents(self):
+ """Test fake user agent generation with browseragents module."""
+ self._test_fake_user_agent_randomness()
+
+ @require_modules('fake_useragent')
+ def test_with_fake_useragent(self):
+ """Test fake user agent generation with fake_useragent module."""
+ self._test_fake_user_agent_randomness()
+
+
+class LiveFakeUserAgentTestCase(TestCase):
+
+ """Test the usage of fake user agent."""
+
+ sites = {
+ 'httpbin': {
+ 'hostname': 'httpbin.org',
+ },
+ }
+
+ def setUp(self):
+ """Set up the unit test."""
+ self.orig_fake_user_agent_exceptions = config.fake_user_agent_exceptions
+ super(LiveFakeUserAgentTestCase, self).setUp()
+
+ def tearDown(self):
+ """Tear down unit test."""
+ config.fake_user_agent_exceptions = self.orig_fake_user_agent_exceptions
+ super(LiveFakeUserAgentTestCase, self).tearDown()
+
+ def _test_fetch_use_fake_user_agent(self):
+ """Test `use_fake_user_agent` argument of http.fetch."""
+ # Existing headers
+ r = http.fetch(
+ 'http://httpbin.org/status/200', headers={'user-agent': 'EXISTING'})
+ self.assertEqual(r.headers['user-agent'], 'EXISTING')
+
+ # Argument value changes
+ r = http.fetch('http://httpbin.org/status/200', use_fake_user_agent=True)
+ self.assertNotEqual(r.headers['user-agent'], http.user_agent())
+ r = http.fetch('http://httpbin.org/status/200', use_fake_user_agent=False)
+ self.assertEqual(r.headers['user-agent'], http.user_agent())
+ r = http.fetch(
+ 'http://httpbin.org/status/200', use_fake_user_agent='ARBITRARY')
+ self.assertEqual(r.headers['user-agent'], 'ARBITRARY')
+
+ # Manually overridden domains
+ config.fake_user_agent_exceptions = {'httpbin.org': 'OVERRIDDEN'}
+ r = http.fetch(
+ 'http://httpbin.org/status/200', use_fake_user_agent=False)
+ self.assertEqual(r.headers['user-agent'], 'OVERRIDDEN')
+
+ @require_modules('browseragents')
+ def test_fetch_with_browseragents(self):
+ """Test method with browseragents module."""
+ self._test_fetch_use_fake_user_agent()
+
+ @require_modules('fake_useragent')
+ def test_fetch_with_fake_useragent(self):
+ """Test method with fake_useragent module."""
+ self._test_fetch_use_fake_user_agent()
+
+
+class GetFakeUserAgentTestCase(TestCase):
+
+ """Test the deprecated get_fake_user_agent()."""
+
+ net = False
+
def setUp(self):
"""Set up unit test."""
self.orig_fake_user_agent = config.fake_user_agent
+ super(GetFakeUserAgentTestCase, self).setUp()
def tearDown(self):
"""Tear down unit test."""
config.fake_user_agent = self.orig_fake_user_agent
+ super(GetFakeUserAgentTestCase, self).tearDown()
- def _test_fake_user_agent_config(self):
+ def _test_fake_user_agent_randomness(self):
+ """Test if user agent returns are randomized."""
+ config.fake_user_agent = True
+ self.assertNotEqual(http.get_fake_user_agent(), http.get_fake_user_agent())
+
+ def _test_config_settings(self):
"""Test if method honours configuration toggle."""
# ON: True and None in config are considered turned on.
config.fake_user_agent = True
@@ -315,25 +396,20 @@
# OFF: All other values won't make it return random UA.
config.fake_user_agent = False
self.assertEqual(http.get_fake_user_agent(), http.user_agent())
- config.fake_user_agent = 'ArbitraryValue'
- self.assertEqual(http.get_fake_user_agent(), 'ArbitraryValue')
-
- def _test_fake_user_agent_randomness(self):
- """Test if user agent returns are randomized."""
- config.fake_user_agent = True
- self.assertNotEqual(http.get_fake_user_agent(), http.get_fake_user_agent())
+ config.fake_user_agent = 'ARBITRARY'
+ self.assertEqual(http.get_fake_user_agent(), 'ARBITRARY')
@require_modules('browseragents')
def test_with_browseragents(self):
- """Test fake user agent generation with browseragents module."""
- self._test_fake_user_agent_config()
+ """Test method with browseragents module."""
self._test_fake_user_agent_randomness()
+ self._test_config_settings()
@require_modules('fake_useragent')
def test_with_fake_useragent(self):
- """Test fake user agent generation with fake_useragent module."""
- self._test_fake_user_agent_config()
+ """Test method with fake_useragent module."""
self._test_fake_user_agent_randomness()
+ self._test_config_settings()
class CharsetTestCase(TestCase):
--
To view, visit https://gerrit.wikimedia.org/r/325241
Gerrit-MessageType: merged
Gerrit-Change-Id: I28594fd1b5ccb6ed3e885db5600bb0464dccfa0e
Gerrit-PatchSet: 17
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Dargasia <thx(a)riseup.net>
Gerrit-Reviewer: Dargasia <thx(a)riseup.net>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>
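Note on the change above: the new live tests pin down how the
use_fake_user_agent argument of http.fetch is expected to behave. The
following is only a rough, self-contained sketch of that selection logic
(Python 3), not the code in pywikibot/comms/http.py. The function name
choose_user_agent, the default_ua value and the make_fake_ua stand-in are
hypothetical, and the precedence between a caller-supplied header and a
per-domain exception is an assumption, since the tests do not exercise both
at once.

```python
import uuid
from urllib.parse import urlparse


def _random_fake_ua():
    """Stand-in for a real fake-UA generator (hypothetical)."""
    return 'FakeBrowser/' + uuid.uuid4().hex


def choose_user_agent(url, headers=None, use_fake_user_agent=False,
                      exceptions=None, default_ua='DefaultBot/1.0',
                      make_fake_ua=_random_fake_ua):
    """Return the User-Agent string a request to `url` would carry."""
    # 1. A User-Agent header supplied by the caller is left untouched.
    for name, value in (headers or {}).items():
        if name.lower() == 'user-agent':
            return value

    # 2. A per-domain entry (cf. config.fake_user_agent_exceptions in the
    #    tests) replaces whatever the argument said.
    domain = urlparse(url).netloc
    override = (exceptions or {}).get(domain)
    if override is not None:
        use_fake_user_agent = override

    # 3. A string is used verbatim, True asks for a random fake UA,
    #    anything else falls back to the default UA.
    if isinstance(use_fake_user_agent, str):
        return use_fake_user_agent
    if use_fake_user_agent is True:
        return make_fake_ua()
    return default_ua
```

Keeping the decision in a pure function like this would let the "Dry" cases
run without network access, while the live httpbin.org tests remain the
check that the header actually reaches the server.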
jenkins-bot has submitted this change and it was merged.
Change subject: template.py: fix failed substitution in <poem> tag
......................................................................
template.py: fix failed substitution in <poem> tag
Bug: T151931
Change-Id: If8daaecb3ce343a0369c6f0ed60432126ed9abe0
---
M scripts/template.py
1 file changed, 5 insertions(+), 3 deletions(-)
Approvals:
Dalba: Looks good to me, approved
jenkins-bot: Verified
diff --git a/scripts/template.py b/scripts/template.py
index c4bbc6c..66f3e95 100755
--- a/scripts/template.py
+++ b/scripts/template.py
@@ -17,7 +17,9 @@
-subst Resolves the template by putting its text directly into the
article. This is done by changing {{...}} or {{msg:...}} into
- {{subst:...}}
+ {{subst:...}}.
+ Substitution is not available inside <ref>...</ref>,
+ <gallery>...</gallery> and <poem>...</poem> tags.
-assubst Replaces the first argument as old template with the second
argument as new template but substitutes it like -subst does.
@@ -221,11 +223,11 @@
if self.getOption('subst') and self.getOption('remove'):
replacements.append((templateRegex,
r'{{subst:%s\g<parameters>}}' % new))
- exceptions['inside-tags'] = ['ref', 'gallery']
+ exceptions['inside-tags'] = ['ref', 'gallery', 'poem']
elif self.getOption('subst'):
replacements.append((templateRegex,
r'{{subst:%s\g<parameters>}}' % old))
- exceptions['inside-tags'] = ['ref', 'gallery']
+ exceptions['inside-tags'] = ['ref', 'gallery', 'poem']
elif self.getOption('remove'):
replacements.append((templateRegex, ''))
else:
--
To view, visit https://gerrit.wikimedia.org/r/324267
Gerrit-MessageType: merged
Gerrit-Change-Id: If8daaecb3ce343a0369c6f0ed60432126ed9abe0
Gerrit-PatchSet: 6
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Dalba <dalba.wiki(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>
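Note on the change above: adding 'poem' to exceptions['inside-tags'] means
template calls inside <poem>...</poem> are now skipped, just like those
inside <ref> and <gallery>. The following is a minimal standalone sketch of
that idea using only the re module. It is not how Pywikibot performs the
replacement; the helper name subst_outside_tags and both regular expressions
are illustrative assumptions, and nested templates or nested tags are not
handled.

```python
import re

SKIP_TAGS = ('ref', 'gallery', 'poem')


def subst_outside_tags(text, template, skip_tags=SKIP_TAGS):
    """Turn {{template...}} into {{subst:template...}} outside skip_tags."""
    # Regions such as <poem ...>...</poem>; self-closing tags like
    # <ref name="x"/> are deliberately not treated as region openers.
    protected = re.compile(
        r'<(%s)\b[^>]*(?<!/)>.*?</\1\s*>'
        % '|'.join(re.escape(tag) for tag in skip_tags),
        re.IGNORECASE | re.DOTALL)
    # A simple, non-nested call: {{name}}, {{name|args}} or {{msg:name}}.
    call = re.compile(
        r'\{\{\s*(?:msg:)?\s*%s\s*(?P<parameters>\|[^{}]*)?\}\}'
        % re.escape(template))

    def substitute(chunk):
        return call.sub(
            lambda m: '{{subst:%s%s}}'
            % (template, m.group('parameters') or ''),
            chunk)

    pieces = []
    position = 0
    for region in protected.finditer(text):
        # Rewrite the text before the protected region, keep the region.
        pieces.append(substitute(text[position:region.start()]))
        pieces.append(region.group(0))
        position = region.end()
    pieces.append(substitute(text[position:]))
    return ''.join(pieces)
```

For example, subst_outside_tags('{{foo}} <poem>{{foo}}</poem>', 'foo')
rewrites only the first call and leaves the one inside the poem untouched.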