jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/288871 )
Change subject: Use bs4 for imageharvest, not BeautifulSoup v3
......................................................................
Use bs4 for imageharvest, not BeautifulSoup v3
Change necessary scripts in scripts/ and tests/ to ensure that
imageharvest uses BeautifulSoup v4, not v3
Bug: T115428
Change-Id: Icfd98d357126623f9431797188a52fcbcfe40dbc
---
M scripts/imageharvest.py
M tests/script_tests.py
2 files changed, 19 insertions(+), 8 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/scripts/imageharvest.py b/scripts/imageharvest.py
index 300676d..ddbf06a 100644
--- a/scripts/imageharvest.py
+++ b/scripts/imageharvest.py
@@ -5,7 +5,7 @@
It takes a URL as an argument and finds all images (and other files specified
by the extensions in 'fileformats') that URL is referring to, asking whether to
upload them. If further arguments are given, they are considered to be the text
-that is common to the descriptions.
+that is common to the descriptions. BeautifulSoup is needed only in this case.
A second use is to get a number of images that have URLs only differing in
numbers. To do this, use the command line option "-pattern", and give the URL
@@ -24,10 +24,13 @@
from __future__ import absolute_import, unicode_literals
__version__ = '$Id$'
-#
+
import os
-import BeautifulSoup
+try:
+    from bs4 import BeautifulSoup
+except ImportError as e:
+    BeautifulSoup = e
import pywikibot
@@ -45,11 +48,15 @@
 def get_imagelinks(url):
     """Given a URL, get all images linked to by the page at that URL."""
+    # Check if BeautifulSoup is imported.
+    if isinstance(BeautifulSoup, ImportError):
+        raise BeautifulSoup
+
     links = []
     uo = URLopener()
-    file = uo.open(url)
-    soup = BeautifulSoup.BeautifulSoup(file.read())
-    file.close()
+    with uo.open(url) as f:
+        soup = BeautifulSoup(f.read())
+
     if not shown:
         tagname = "a"
     elif shown == "just":
diff --git a/tests/script_tests.py b/tests/script_tests.py
index 9a20c8c..b305a2b 100644
--- a/tests/script_tests.py
+++ b/tests/script_tests.py
@@ -40,9 +40,8 @@
     'imagecopy_self': [TK_IMPORT],
     'script_wui': ['crontab', 'lua'],
     # Note: package 'lunatic-python' provides module 'lua'
-
     'flickrripper': ['flickrapi'],
-    'imageharvest': ['BeautifulSoup'],
+    'imageharvest': ['beautifulsoup4'],
     'match_images': ['PIL.ImageTk'],
     'states_redirect': ['pycountry'],
     'patrol': ['mwparserfromhell'],
@@ -391,6 +390,11 @@
     net = False
     _expected_failures = failed_dep_script_list
+    # -help supported not explicitly
+    try:
+        _expected_failures.remove('imageharvest')
+    except ValueError:
+        pass
     _allowed_failures = []
     _argument = 'help'
--
To view, visit
https://gerrit.wikimedia.org/r/288871
Gerrit-MessageType: merged
Gerrit-Change-Id: Icfd98d357126623f9431797188a52fcbcfe40dbc
Gerrit-PatchSet: 13
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Darthbhyrava <hbhyrava(a)gmail.com>
Gerrit-Reviewer: Darthbhyrava <hbhyrava(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: MtDu <justin.d128(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
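
For readers unfamiliar with the pattern used in scripts/imageharvest.py above, here is a minimal standalone sketch of a deferred optional import. It is not code from this change: the module name `hypothetical_missing_parser` and both function names are made up for illustration, and the import is expected to fail so that the deferred error path is exercised.

```python
# Sketch of a deferred optional import. `hypothetical_missing_parser` is a
# made-up module name (an assumption for illustration), chosen so that the
# import fails and the stored-exception path is taken.
try:
    import hypothetical_missing_parser as parser
except ImportError as e:
    # Keep the exception object instead of failing at import time, so the
    # module can still be imported (for example to print its help text).
    parser = e


def needs_parser(text):
    """Use the optional dependency, raising the stored error if it is absent."""
    if isinstance(parser, ImportError):
        raise parser
    return parser.parse(text)


def works_without_parser(text):
    """A code path that never touches the optional dependency."""
    return text.strip()
```

With this layout, importing the module and calling `works_without_parser` succeeds even when the dependency is missing; only `needs_parser` raises the original `ImportError`, which matches the docstring note that BeautifulSoup is needed only for the URL-scanning case.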