jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/354719 )
Change subject: Page might have been moved without redirect or deleted. Test if the page actually exists.
......................................................................
Page might have been moved without redirect or deleted.
Test if the page actually exists.
Bug: T86491
Change-Id: Id5b6b0799bcac2aa69b26e9e8a7dd86e282f95c9
---
M scripts/newitem.py
1 file changed, 3 insertions(+), 0 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/scripts/newitem.py b/scripts/newitem.py
index 5ee4901..52ca0b6 100755
--- a/scripts/newitem.py
+++ b/scripts/newitem.py
@@ -94,6 +94,9 @@
self.current_page = page
+ if not page.exists():
+ pywikibot.output('%s does not exist. Skipping.' % page)
+ return
if page.isRedirectPage():
pywikibot.output(u'%s is a redirect page. Skipping.' % page)
return
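For readers skimming the diff: the new existence guard runs before the
existing redirect check, so a page that was deleted or moved without a
redirect is skipped before any further processing. A minimal standalone
sketch of the resulting logic (not the actual scripts/newitem.py code;
the function name is illustrative):

    import pywikibot

    def should_skip(page):
        """Sketch of the guard order in treat() after this change."""
        if not page.exists():
            # Covers pages moved without redirect, or deleted (T86491).
            pywikibot.output('%s does not exist. Skipping.' % page)
            return True
        if page.isRedirectPage():
            pywikibot.output('%s is a redirect page. Skipping.' % page)
            return True
        return False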
--
To view, visit https://gerrit.wikimedia.org/r/354719
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Id5b6b0799bcac2aa69b26e9e8a7dd86e282f95c9
Gerrit-PatchSet: 2
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Multichill <maarten(a)mdammers.nl>
Gerrit-Reviewer: Lokal Profil <lokal.profil(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Multichill <maarten(a)mdammers.nl>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/354697 )
Change subject: Make the tests pass again: * https://en.wikisource.org/w/index.php?type=revision&diff=6811951&oldid=4889… made the number of validated pages increase * https://en.wikipedia.org/w/index.php?type=revision&diff=780329438&oldid=780… made the edit test
......................................................................
Make the tests pass again:
* https://en.wikisource.org/w/index.php?type=revision&diff=6811951&oldid=4889…
made the number of validated pages increase
* https://en.wikipedia.org/w/index.php?type=revision&diff=780329438&oldid=780… made the edit test fail
* wikistats is broken, just disabled the tests
Bug: T165830
Change-Id: Iee6f815f7995eb6dc60d3a91f69fc24a6f101037
---
M tests/pagegenerators_tests.py
M tests/wikistats_tests.py
2 files changed, 6 insertions(+), 4 deletions(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/tests/pagegenerators_tests.py b/tests/pagegenerators_tests.py
index b24180d..6eaca83 100755
--- a/tests/pagegenerators_tests.py
+++ b/tests/pagegenerators_tests.py
@@ -238,7 +238,7 @@
site = self.site
gen = pagegenerators.PagesFromTitlesGenerator(self.titles, site)
gen = pagegenerators.CategoryFilterPageGenerator(gen, self.catfilter_list, site)
- self.assertEqual(len(tuple(gen)), 9)
+ self.assertEqual(len(tuple(gen)), 10)
class TestQualityFilterPageGenerator(TestCase):
@@ -314,12 +314,12 @@
gen, last_edit_end=two_days_ago)
self.assertEqual(len(list(gen)), 0)
- gen = PagesFromTitlesGenerator(['Template:Sidebox'], self.site)
+ gen = PagesFromTitlesGenerator(['Template:Side box'], self.site)
gen = pagegenerators.EdittimeFilterPageGenerator(
gen, last_edit_end=nine_days_ago)
self.assertEqual(len(list(gen)), 1)
- gen = PagesFromTitlesGenerator(['Template:Sidebox'], self.site)
+ gen = PagesFromTitlesGenerator(['Template:Side box'], self.site)
gen = pagegenerators.EdittimeFilterPageGenerator(
gen, last_edit_start=nine_days_ago)
self.assertEqual(len(list(gen)), 0)
diff --git a/tests/wikistats_tests.py b/tests/wikistats_tests.py
index f866eaf..c373ae3 100644
--- a/tests/wikistats_tests.py
+++ b/tests/wikistats_tests.py
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
"""Test cases for the WikiStats dataset."""
#
-# (C) Pywikibot team, 2014-2016
+# (C) Pywikibot team, 2014-2017
#
# Distributed under the terms of the MIT license.
#
@@ -15,6 +15,8 @@
from tests.aspects import unittest, TestCase
+@unittest.skip('Wikistats at https://wikistats.wmflabs.org/ '
+               'appears to be broken. See T165830.')
class WikiStatsTestCase(TestCase):
"""Test WikiStats dump."""
--
To view, visit https://gerrit.wikimedia.org/r/354697
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Iee6f815f7995eb6dc60d3a91f69fc24a6f101037
Gerrit-PatchSet: 3
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Multichill <maarten(a)mdammers.nl>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/354714 )
Change subject: Config2: limit the number of retries to 15
......................................................................
Config2: limit the number of retries to 15
This limits the waiting time until timeout from 43 min to 23.
Bug: T165898
Change-Id: Icb13572c91c2eb5ab1e8844ab4073205fe1f29f3
---
M pywikibot/config2.py
1 file changed, 1 insertion(+), 1 deletion(-)
Approvals:
jenkins-bot: Verified
Xqt: Looks good to me, approved
diff --git a/pywikibot/config2.py b/pywikibot/config2.py
index 9f732f2..78898cc 100644
--- a/pywikibot/config2.py
+++ b/pywikibot/config2.py
@@ -649,7 +649,7 @@
step = -1
# Maximum number of times to retry an API request before quitting.
-max_retries = 25
+max_retries = 15
# Minimum time to wait before resubmitting a failed API request.
retry_wait = 5
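The 43-to-23-minute figures in the commit message are consistent with a
wait that starts at retry_wait seconds and doubles after each failed
attempt, capped at 120 s (config2's retry_max default; assuming that cap
is unchanged here). A quick check of the arithmetic:

    def worst_case_wait(max_retries, retry_wait=5, retry_max=120):
        """Seconds spent waiting when every retry fails (doubling, capped)."""
        total, wait = 0, retry_wait
        for _ in range(max_retries):
            total += wait
            wait = min(wait * 2, retry_max)
        return total

    print(worst_case_wait(25) / 60)  # ~42.6 minutes with the old limit
    print(worst_case_wait(15) / 60)  # ~22.6 minutes with the new limit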
--
To view, visit https://gerrit.wikimedia.org/r/354714
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Icb13572c91c2eb5ab1e8844ab4073205fe1f29f3
Gerrit-PatchSet: 4
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Strainu <wiki(a)strainu.ro>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Strainu <wiki(a)strainu.ro>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/354590 )
Change subject: Remove WDQ from pywikibot
......................................................................
Remove WDQ from pywikibot
Bug: T162585
Change-Id: I8dadf884b7255a1fb4af18c8892df3f8c443ab58
---
M docs/api_ref/pywikibot.data.rst
M docs/api_ref/tests/index.rst
D docs/api_ref/tests/wikidataquery_tests.rst
M pywikibot/README.rst
D pywikibot/data/wikidataquery.py
M pywikibot/pagegenerators.py
M tests/__init__.py
M tests/aspects.py
D tests/wikidataquery_tests.py
9 files changed, 1 insertion(+), 991 deletions(-)
Approvals:
Lokal Profil: Looks good to me, approved
Jean-Frédéric: Looks good to me, but someone else must approve
jenkins-bot: Verified
diff --git a/docs/api_ref/pywikibot.data.rst b/docs/api_ref/pywikibot.data.rst
index 16f51f7..a42e13e 100644
--- a/docs/api_ref/pywikibot.data.rst
+++ b/docs/api_ref/pywikibot.data.rst
@@ -17,13 +17,6 @@
:undoc-members:
:show-inheritance:
-pywikibot.data.wikidataquery module
------------------------------------
-
-.. automodule:: pywikibot.data.wikidataquery
- :members:
- :undoc-members:
- :show-inheritance:
pywikibot.data.wikistats module
-------------------------------
diff --git a/docs/api_ref/tests/index.rst b/docs/api_ref/tests/index.rst
index 224328f..92957b4 100644
--- a/docs/api_ref/tests/index.rst
+++ b/docs/api_ref/tests/index.rst
@@ -29,7 +29,6 @@
edit_failure<./edit_failure_tests>
timestripper<./timestripper_tests>
pagegenerators<./pagegenerators_tests>
- wikidataquery<./wikidataquery_tests>
weblib<./weblib_tests>
i18n<./i18n_tests>
wikistats<./wikistats_tests>
diff --git a/docs/api_ref/tests/wikidataquery_tests.rst b/docs/api_ref/tests/wikidataquery_tests.rst
deleted file mode 100644
index 9307550..0000000
--- a/docs/api_ref/tests/wikidataquery_tests.rst
+++ /dev/null
@@ -1,15 +0,0 @@
-===================
-wikidataquery_tests
-===================
- Tests in ``tests.wikidataquery_tests``:
-
----------------
-Available tests
----------------
- .. autoclass:: tests.wikidataquery_tests.TestDryApiFunctions
- :members:
- .. autoclass:: tests.wikidataquery_tests.TestLiveApiFunctions
- :members:
- .. autoclass:: tests.wikidataquery_tests.TestApiSlowFunctions
- :members:
-
diff --git a/pywikibot/README.rst b/pywikibot/README.rst
index deb70bb..ad05394 100644
--- a/pywikibot/README.rst
+++ b/pywikibot/README.rst
@@ -136,9 +136,6 @@
+---------------------------+-------------------------------------------------------+
| sparql.py | Objects representing SPARQL query API |
+---------------------------+-------------------------------------------------------+
- | wikidataquery.py | Objects representing WikidataQuery query syntax |
- | | and API |
- +---------------------------+-------------------------------------------------------+
| wikistats.py | Objects representing WikiStats API |
+---------------------------+-------------------------------------------------------+
diff --git a/pywikibot/data/wikidataquery.py b/pywikibot/data/wikidataquery.py
deleted file mode 100644
index f28a376..0000000
--- a/pywikibot/data/wikidataquery.py
+++ /dev/null
@@ -1,633 +0,0 @@
-# -*- coding: utf-8 -*-
-"""Objects representing WikidataQuery query syntax and API."""
-#
-# (C) Pywikibot team, 2013
-#
-# Distributed under the terms of the MIT license.
-from __future__ import absolute_import, unicode_literals
-
-import hashlib
-import json
-import os
-import pickle
-import sys
-import tempfile
-import time
-
-if sys.version_info[0] > 2:
- from urllib.parse import quote
- basestring = (str, )
-else:
- from urllib2 import quote
-
-import pywikibot
-
-from pywikibot.comms import http
-
-from pywikibot import config
-from pywikibot.page import ItemPage, PropertyPage, Claim
-
-
-def listify(x):
- """
- If given a non-list, encapsulate in a single-element list.
-
- @rtype: list
- """
- return x if isinstance(x, list) else [x]
-
-
-class QuerySet(object):
-
- """
- A QuerySet represents a set of queries or other query sets.
-
- Queries may be joined by operators (AND and OR).
-
- A QuerySet stores this information as a list of Query(Sets) and
- a joiner operator to join them all together
- """
-
- def __init__(self, q):
- """
- Initialise a query set from a Query or another QuerySet.
-
- @type q: Query or QuerySet
- """
- self.qs = [q]
-
- def addJoiner(self, args, joiner):
- """
- Add to this QuerySet using the given joiner.
-
- If the given joiner is not the same as we used before in
- this QuerySet, nest the current one in parens before joining.
- This makes the implicit grouping of the API explicit.
-
- @return: a new query set representing the joining of this one and
- the arguments
- """
- if len(self.qs) > 1 and joiner != self.joiner:
- left = QuerySet(self)
- else:
- left = self
-
- left.joiner = joiner
-
- for a in listify(args):
- left.qs.append(a)
-
- return left
-
- def AND(self, args):
- """
- Add the given args (Queries or QuerySets) to the Query set as a logical conjunction (AND).
-
- @type args: Query or QuerySet
- """
- return self.addJoiner(args, "AND")
-
- def OR(self, args):
- """
- Add the given args (Queries or QuerySets) to the Query set as a logical disjunction (OR).
-
- @type args: Query or QuerySet
- """
- return self.addJoiner(args, "OR")
-
- def __str__(self):
- """
- Output as an API-ready string.
-
- @rtype: str
- """
- def bracketIfQuerySet(q):
- if isinstance(q, QuerySet) and q.joiner != self.joiner:
- return "(%s)" % q
- else:
- return str(q)
-
- s = bracketIfQuerySet(self.qs[0])
-
- for q in self.qs[1:]:
- s += " %s %s" % (self.joiner, bracketIfQuerySet(q))
-
- return s
-
- def __repr__(self):
- """Return a string representation."""
- return u"QuerySet(%s)" % self
-
-
-class Query(object):
-
- """
- A query is a single query for the WikidataQuery API.
-
- For example:
- claim[100:60] or link[enwiki]
-
- Construction of a Query can throw a TypeError if you feed it bad
- parameters. Exactly what these need to be depends on the Query
- """
-
- def AND(self, ands):
- """
- Produce a query set ANDing this query and all the given query/sets.
-
- @type ands: Query or list of Query
- """
- return QuerySet(self).addJoiner(ands, "AND")
-
- def OR(self, ors):
- """
- Produce a query set ORing this query and all the given query/sets.
-
- @type ors: Query or list of Query
- """
- return QuerySet(self).addJoiner(ors, "OR")
-
- def formatItem(self, item):
- """
- Default item formatting is string.
-
- This will work for queries, querysets, ints and strings
- """
- return str(item)
-
- def formatList(self, l):
- """
- Format and comma-join a list.
-
- @type l: list
- """
- return ",".join([self.formatItem(x) for x in l])
-
- @staticmethod
- def isOrContainsOnlyTypes(items, types):
- """
- Either this item is one of the given types, or it is a list of only those types.
-
- @rtype: bool
- """
- if isinstance(items, list):
- for x in items:
- found = False
- for typ in listify(types):
- if isinstance(x, typ):
- found = True
- break
-
- if not found:
- return False
- else:
- for typ in listify(types):
- found = False
- if isinstance(items, typ):
- found = True
- break
-
- if not found:
- return False
-
- return True
-
- def validate(self):
- """
- Validate the query parameters.
-
- Default validate result is a pass - subclasses need to implement
- this if they want to check their parameters.
-
- @return: True
- @rtype: bool
- """
- return True
-
- def validateOrRaise(self, msg=None):
- """Validate the contents and raise TypeError if the validation fails."""
- if not self.validate():
- raise TypeError(msg)
-
- def convertWDType(self, item):
- """
- Convert Wikibase items like ItemPage or PropertyPage into integer IDs.
-
- The resulting IDs may be used in query strings.
-
- @param item: A single item. One of ItemPages, PropertyPages, int
- or anything that can be fed to int()
-
- @return: the int ID of the item
- """
- if isinstance(item, ItemPage) or isinstance(item, PropertyPage):
- return item.getID(numeric=True)
- else:
- return int(item)
-
- def convertWDTypes(self, items):
- """Convert the items into integer IDs using L{Query.convertWDType}."""
- return [self.convertWDType(x) for x in listify(items)]
-
- def __str__(self):
- """
- Generate a query string to be passed to the WDQ API.
-
- Sub-classes must override this method.
-
- @raises NotImplementedError: Always raised by this abstract method
- """
- raise NotImplementedError
-
- def __repr__(self):
- """Return a string representation."""
- return u"Query(%s)" % self
-
-
-class HasClaim(Query):
-
- """
- This is a Query of the form "claim[prop:val]".
-
- It is subclassed by
- the other similar forms like noclaim and string
- """
-
- queryType = "claim"
-
- def __init__(self, prop, items=[]):
- """Constructor."""
- self.prop = self.convertWDType(prop)
-
- if isinstance(items, Query):
- self.items = items
- elif isinstance(self, StringClaim):
- self.items = listify(items)
- else:
- self.items = self.convertWDTypes(items)
-
- self.validateOrRaise()
-
- def formatItems(self):
- """Format the items when they are a list."""
- res = ''
- if self.items:
- res += ":" + ",".join([self.formatItem(x) for x in self.items])
-
- return res
-
- def validate(self):
- """Validate that the items are ints or Querys."""
- return self.isOrContainsOnlyTypes(self.items, [int, Query])
-
- def __str__(self):
- """Return the query string for the API."""
- if isinstance(self.items, list):
- return "%s[%s%s]" % (self.queryType, self.prop, self.formatItems())
- elif isinstance(self.items, Query):
- return "%s[%s:(%s)]" % (self.queryType, self.prop, self.items)
-
-
-class NoClaim(HasClaim):
-
- """Query of the form noclaim[PROPERTY]."""
-
- queryType = "noclaim"
-
-
-class StringClaim(HasClaim):
-
- """Query of the form string[PROPERTY:"STRING",...]."""
-
- queryType = "string"
-
- def formatItem(self, x):
- """Add quotes around string."""
- return '"%s"' % x
-
- def validate(self):
- """Validate that the items are strings."""
- return self.isOrContainsOnlyTypes(self.items, basestring)
-
-
-class Tree(Query):
-
- """Query of the form tree[ITEM,...][PROPERTY,...]<PROPERTY,...>."""
-
- queryType = "tree"
-
- def __init__(self, item, forward=[], reverse=[]):
- """
- Constructor.
-
- @param item: The root item
- @param forward: List of forward properties, can be empty
- @param reverse: List of reverse properties, can be empty
- """
- # check sensible things coming in, as we lose info once we do
- # type conversion
- if not self.isOrContainsOnlyTypes(item, [int, ItemPage]):
- raise TypeError("The item paramter must contain or be integer IDs "
- "or page.ItemPages")
- elif (not self.isOrContainsOnlyTypes(forward, [int, PropertyPage]) or
- not self.isOrContainsOnlyTypes(reverse, [int, PropertyPage])):
- raise TypeError("The forward and reverse parameters must contain "
- "or be integer IDs or page.PropertyPages")
-
- self.item = self.convertWDTypes(item)
- self.forward = self.convertWDTypes(forward)
- self.reverse = self.convertWDTypes(reverse)
-
- self.validateOrRaise()
-
- def validate(self):
- """Validate that the item, forward and reverse are all ints."""
- return (self.isOrContainsOnlyTypes(self.item, int) and
- self.isOrContainsOnlyTypes(self.forward, int) and
- self.isOrContainsOnlyTypes(self.reverse, int))
-
- def __str__(self):
- """Return the query string for the API."""
- return "%s[%s][%s][%s]" % (self.queryType, self.formatList(self.item),
- self.formatList(self.forward),
- self.formatList(self.reverse))
-
-
-class Around(Query):
-
- """A query in the form around[PROPERTY,LATITUDE,LONGITUDE,RADIUS]."""
-
- queryType = "around"
-
- def __init__(self, prop, coord, rad):
- """Constructor."""
- self.prop = self.convertWDType(prop)
- self.lt = coord.lat
- self.lg = coord.lon
- self.rad = rad
-
- def validate(self):
- """Validate that the prop is an int."""
- return isinstance(self.prop, int)
-
- def __str__(self):
- """Return the query string for the API."""
- return "%s[%s,%s,%s,%s]" % (self.queryType, self.prop,
- self.lt, self.lg, self.rad)
-
-
-class Between(Query):
-
- """
- A query in the form between[PROP, BEGIN, END].
-
- You have to give prop and one of begin or end. Note that times have
- to be in UTC, timezones are not supported by the API
-
- @param prop: the property
- @param begin: WbTime object representing the beginning of the period
- @param end: WbTime object representing the end of the period
- """
-
- queryType = "between"
-
- def __init__(self, prop, begin=None, end=None):
- """Constructor."""
- self.prop = self.convertWDType(prop)
- self.begin = begin
- self.end = end
-
- def validate(self):
- """Validate that a range is given and the prop is an int."""
- return (self.begin or self.end) and isinstance(self.prop, int)
-
- def __str__(self):
- """Return the query string for the API."""
- begin = self.begin.toTimestr() if self.begin else ''
-
- # if you don't have an end, you don't put in the comma
- end = ',' + self.end.toTimestr() if self.end else ''
-
- return "%s[%s,%s%s]" % (self.queryType, self.prop, begin, end)
-
-
-class Link(Query):
-
- """
- A query in the form link[LINK,...], which also includes nolink.
-
- All link elements have to be strings, or validation will throw
- """
-
- queryType = "link"
-
- def __init__(self, link):
- """Constructor."""
- self.link = listify(link)
- self.validateOrRaise()
-
- def validate(self):
- """Validate that the link is a string."""
- return self.isOrContainsOnlyTypes(self.link, basestring)
-
- def __str__(self):
- """Return the query string for the API."""
- return "%s[%s]" % (self.queryType, self.formatList(self.link))
-
-
-class NoLink(Link):
-
- """A query in the form nolink[..]."""
-
- queryType = "nolink"
-
-
-def fromClaim(claim):
- """
- Construct from a pywikibot.page Claim object.
-
- @type claim: L{pywikibot.page.Claim}
- @rtype: L{Query}
- """
- if not isinstance(claim, Claim):
- raise TypeError("claim must be a page.Claim")
-
- if claim.type == 'wikibase-item':
- return HasClaim(claim.getID(numeric=True), claim.getTarget().getID(numeric=True))
- if claim.type == 'commonsMedia':
- return StringClaim(claim.getID(numeric=True),
- claim.getTarget().title(withNamespace=False))
- if claim.type in ('string', 'url', 'math', 'external-id'):
- return StringClaim(claim.getID(numeric=True), claim.getTarget())
- else:
- raise TypeError("Cannot construct a query from a claim of type %s"
- % claim.type)
-
-
-class WikidataQuery(object):
-
- """
- An interface to the WikidataQuery API.
-
- Default host is
- https://wdq.wmflabs.org/, but you can substitute
- a different one.
-
- Caching defaults to a subdir of the system temp directory with a
- 1 hour max cache age.
-
- Set a zero or negative maxCacheAge to disable caching
- """
-
- def __init__(self, host="https://wdq.wmflabs.org", cacheDir=None,
- cacheMaxAge=60):
- """Constructor."""
- self.host = host
- self.cacheMaxAge = cacheMaxAge
-
- if cacheDir:
- self.cacheDir = cacheDir
- else:
- self.cacheDir = os.path.join(tempfile.gettempdir(),
- "wikidataquery_cache")
-
- def getUrl(self, queryStr):
- """Get the URL given the query string."""
- return "%s/api?%s" % (self.host, queryStr)
-
- def getQueryString(self, q, labels=[], props=[]):
- """
- Get the query string for a given query or queryset.
-
- @return: string including labels and props
- """
- qStr = "q=%s" % quote(str(q))
-
- if labels:
- qStr += "&labels=%s" % ','.join(labels)
-
- if props:
- qStr += "&props=%s" % ','.join(props)
-
- return qStr
-
- def getCacheFilename(self, queryStr):
- """
- Encode a query into a unique and universally safe format.
-
- @rtype: unicode
- """
- encQuery = hashlib.sha1(queryStr.encode('utf8')).hexdigest() + ".wdq_cache"
- return os.path.join(self.cacheDir, encQuery)
-
- def readFromCache(self, queryStr):
- """
- Load the query result from the cache, if possible.
-
- @return: None if the data is not there or if it is too old.
- """
- if self.cacheMaxAge <= 0:
- return None
-
- cacheFile = self.getCacheFilename(queryStr)
-
- if os.path.isfile(cacheFile):
- mtime = os.path.getmtime(cacheFile)
- now = time.time()
-
- if ((now - mtime) / 60) < self.cacheMaxAge:
- with open(cacheFile, 'rb') as f:
- try:
- data = pickle.load(f)
- except pickle.UnpicklingError:
- pywikibot.warning(u"Couldn't read cached data from %s"
- % cacheFile)
- data = None
-
- return data
-
- return None
-
- def saveToCache(self, q, data):
- """
- Save data from a query to a cache file, if enabled.
-
- @rtype: None
- """
- if self.cacheMaxAge <= 0:
- return
-
- # we have to use our own query string, as otherwise we may
- # not be able to find the cache file again if there are e.g.
- # whitespace differences
- cacheFile = self.getCacheFilename(q)
-
- if os.path.exists(cacheFile) and not os.path.isfile(cacheFile):
- return
-
- if not os.path.exists(self.cacheDir):
- os.makedirs(self.cacheDir)
-
- with open(cacheFile, 'wb') as f:
- try:
- pickle.dump(data, f, protocol=config.pickle_protocol)
- except IOError:
- pywikibot.warning(u"Failed to write cache file %s" % cacheFile)
-
- def getDataFromHost(self, queryStr):
- """
- Go and fetch a query from the host's API.
-
- @rtype: dict
- """
- url = self.getUrl(queryStr)
-
- try:
- resp = http.fetch(url)
- except:
- pywikibot.warning(u"Failed to retrieve %s" % url)
- raise
-
- data = resp.content
- if not data:
- pywikibot.warning('No data received for %s' % url)
- raise pywikibot.ServerError('No data received for %s' % url)
-
- try:
- data = json.loads(data)
- except ValueError:
- pywikibot.warning(
- 'Data received for %s but no JSON could be decoded: %r'
- % (url, data))
- raise pywikibot.ServerError(
- 'Data received for %s but no JSON could be decoded: %r'
- % (url, data))
-
- return data
-
- def query(self, q, labels=[], props=[]):
- """
- Actually run a query over the API.
-
- @return: dict of the interpreted JSON or None on failure
- """
- fullQueryString = self.getQueryString(q, labels, props)
-
- # try to get cached data first
- data = self.readFromCache(fullQueryString)
-
- if data:
- return data
-
- # the cached data must not be OK, go and get real data from the
- # host's API
- data = self.getDataFromHost(fullQueryString)
-
- # no JSON found
- if not data:
- return None
-
- # cache data for next time
- self.saveToCache(fullQueryString, data)
-
- return data
diff --git a/pywikibot/pagegenerators.py b/pywikibot/pagegenerators.py
index 96f2100..c4be421 100644
--- a/pywikibot/pagegenerators.py
+++ b/pywikibot/pagegenerators.py
@@ -265,9 +265,6 @@
"SELECT page_namespace, page_title, FROM page
WHERE page_namespace = 0" and works on the resulting pages.
--wikidataquery Takes a WikidataQuery query string like claim[31:12280]
- and works on the resulting pages.
-
-sparql Takes a SPARQL SELECT query string including ?item
and works on the resulting pages.
@@ -922,9 +919,7 @@
elif arg == '-untagged':
issue_deprecation_warning(arg, None, 2)
elif arg == '-wikidataquery':
- if not value:
- value = pywikibot.input('WikidataQuery string:')
- gen = WikidataQueryPageGenerator(value, site=self.site)
+ issue_deprecation_warning(arg, None, 2)
elif arg == '-sparqlendpoint':
if not value:
value = pywikibot.input('SPARQL endpoint:')
@@ -2723,39 +2718,6 @@
for item in entities)
for sitelink in sitelinks:
yield pywikibot.Page(site, sitelink)
-
-
-def WikidataQueryPageGenerator(query, site=None):
- """Generate pages that result from the given WikidataQuery.
-
- @param query: the WikidataQuery query string.
- @param site: Site for generator results.
- @type site: L{pywikibot.site.BaseSite}
-
- """
- from pywikibot.data import wikidataquery as wdquery
-
- if site is None:
- site = pywikibot.Site()
- repo = site.data_repository()
- is_repo = isinstance(site, pywikibot.site.DataSite)
-
- if not is_repo:
- # limit the results to those with sitelinks to target site
- query += ' link[%s]' % site.dbName()
- wd_queryset = wdquery.QuerySet(query)
-
- wd_query = wdquery.WikidataQuery(cacheMaxAge=0)
- data = wd_query.query(wd_queryset)
- # This item count should not be copied by other generators,
- # and should be removed when wdq becomes a real generator (T135592)
- pywikibot.output(u'retrieved %d items' % data[u'status'][u'items'])
- items_pages = (pywikibot.ItemPage(repo, 'Q{0}'.format(item))
- for item in data[u'items'])
- if is_repo:
- return items_pages
-
- return WikidataPageFromItemGenerator(items_pages, site)
def WikidataSPARQLPageGenerator(query, site=None,
diff --git a/tests/__init__.py b/tests/__init__.py
index 9b5de60..b290f68 100644
--- a/tests/__init__.py
+++ b/tests/__init__.py
@@ -110,7 +110,6 @@
'timestripper',
'pagegenerators',
'cosmetic_changes',
- 'wikidataquery',
'wikistats',
'weblib',
'i18n',
diff --git a/tests/aspects.py b/tests/aspects.py
index 0d3ace2..e681d28 100644
--- a/tests/aspects.py
+++ b/tests/aspects.py
@@ -18,9 +18,7 @@
skip if the user is blocked.
sysop flag, implement in site & page, and
possibly some of the script tests.
- labs flag, for wikidataquery
slow flag
- wikiquerydata - quite slow
weblib - also slow
(this class, and a FastTest, could error/pass based
it consumed more than a specified amount of time allowed.)
diff --git a/tests/wikidataquery_tests.py b/tests/wikidataquery_tests.py
deleted file mode 100644
index 3693104..0000000
--- a/tests/wikidataquery_tests.py
+++ /dev/null
@@ -1,290 +0,0 @@
-# -*- coding: utf-8 -*-
-"""Test cases for the WikidataQuery query syntax and API."""
-#
-# (C) Pywikibot team, 2014
-#
-# Distributed under the terms of the MIT license.
-#
-from __future__ import absolute_import, unicode_literals
-
-import os
-import time
-
-import pywikibot
-
-import pywikibot.data.wikidataquery as query
-
-from pywikibot.page import ItemPage, PropertyPage, Claim
-
-from tests.aspects import unittest, WikidataTestCase, TestCase
-
-
-class TestDryApiFunctions(TestCase):
-
- """Test WikiDataQuery API functions."""
-
- net = False
-
- def testQueries(self):
- """
- Test Queries and check whether they're behaving correctly.
-
- Check that we produce the expected query strings and that
- invalid inputs are rejected correctly
- """
- q = query.HasClaim(99)
- self.assertEqual(str(q), "claim[99]")
-
- q = query.HasClaim(99, 100)
- self.assertEqual(str(q), "claim[99:100]")
-
- q = query.HasClaim(99, [100])
- self.assertEqual(str(q), "claim[99:100]")
-
- q = query.HasClaim(99, [100, 101])
- self.assertEqual(str(q), "claim[99:100,101]")
-
- q = query.NoClaim(99, [100, 101])
- self.assertEqual(str(q), "noclaim[99:100,101]")
-
- q = query.StringClaim(99, "Hello")
- self.assertEqual(str(q), 'string[99:"Hello"]')
-
- q = query.StringClaim(99, ["Hello"])
- self.assertEqual(str(q), 'string[99:"Hello"]')
-
- q = query.StringClaim(99, ["Hello", "world"])
- self.assertEqual(str(q), 'string[99:"Hello","world"]')
-
- self.assertRaises(TypeError, lambda: query.StringClaim(99, 2))
-
- q = query.Tree(92, [1], 2)
- self.assertEqual(str(q), 'tree[92][1][2]')
-
- # missing third arg
- q = query.Tree(92, 1)
- self.assertEqual(str(q), 'tree[92][1][]')
-
- # missing second arg
- q = query.Tree(92, reverse=3)
- self.assertEqual(str(q), 'tree[92][][3]')
-
- q = query.Tree([92, 93], 1, [2, 7])
- self.assertEqual(str(q), 'tree[92,93][1][2,7]')
-
- # bad tree arg types
- self.assertRaises(TypeError, lambda: query.Tree(99, "hello"))
-
- q = query.Link("enwiki")
- self.assertEqual(str(q), 'link[enwiki]')
-
- q = query.NoLink(["enwiki", "frwiki"])
- self.assertEqual(str(q), 'nolink[enwiki,frwiki]')
-
- # bad link arg types
- self.assertRaises(TypeError, lambda: query.Link(99))
- self.assertRaises(TypeError, lambda: query.Link([99]))
-
- # HasClaim with tree as arg
- q = query.HasClaim(99, query.Tree(1, 2, 3))
- self.assertEqual(str(q), "claim[99:(tree[1][2][3])]")
-
- q = query.HasClaim(99, query.Tree(1, [2, 5], [3, 90]))
- self.assertEqual(str(q), "claim[99:(tree[1][2,5][3,90])]")
-
-
-class TestLiveApiFunctions(WikidataTestCase):
-
- """Test WikiDataQuery API functions."""
-
- cached = True
-
- def testQueriesWDStructures(self):
- """Test queries using Wikibase page structures like ItemPage."""
- q = query.HasClaim(PropertyPage(self.repo, "P99"))
- self.assertEqual(str(q), "claim[99]")
-
- q = query.HasClaim(PropertyPage(self.repo, "P99"),
- ItemPage(self.repo, "Q100"))
- self.assertEqual(str(q), "claim[99:100]")
-
- q = query.HasClaim(99, [100, PropertyPage(self.repo, "P101")])
- self.assertEqual(str(q), "claim[99:100,101]")
-
- q = query.StringClaim(PropertyPage(self.repo, "P99"), "Hello")
- self.assertEqual(str(q), 'string[99:"Hello"]')
-
- q = query.Tree(ItemPage(self.repo, "Q92"), [1], 2)
- self.assertEqual(str(q), 'tree[92][1][2]')
-
- q = query.Tree(ItemPage(self.repo, "Q92"), [PropertyPage(self.repo, "P101")], 2)
- self.assertEqual(str(q), 'tree[92][101][2]')
-
- self.assertRaises(TypeError, lambda: query.Tree(PropertyPage(self.repo, "P92"),
- [PropertyPage(self.repo, "P101")],
- 2))
-
- c = pywikibot.Coordinate(50, 60)
- q = query.Around(PropertyPage(self.repo, "P625"), c, 23.4)
- self.assertEqual(str(q), 'around[625,50,60,23.4]')
-
- begin = pywikibot.WbTime(site=self.repo, year=1999)
- end = pywikibot.WbTime(site=self.repo, year=2010, hour=1)
-
- # note no second comma
- q = query.Between(PropertyPage(self.repo, "P569"), begin)
- self.assertEqual(str(q), 'between[569,+00000001999-01-01T00:00:00Z]')
-
- q = query.Between(PropertyPage(self.repo, "P569"), end=end)
- self.assertEqual(str(q), 'between[569,,+00000002010-01-01T01:00:00Z]')
-
- q = query.Between(569, begin, end)
- self.assertEqual(str(q),
- 'between[569,+00000001999-01-01T00:00:00Z,+00000002010-01-01T01:00:00Z]')
-
- # try negative year
- begin = pywikibot.WbTime(site=self.repo, year=-44)
- q = query.Between(569, begin, end)
- self.assertEqual(str(q),
- 'between[569,-00000000044-01-01T00:00:00Z,+00000002010-01-01T01:00:00Z]')
-
- def testQueriesDirectFromClaim(self):
- """Test construction of the right Query from a page.Claim."""
- # Datatype: item
- claim = Claim(self.repo, 'P17')
- claim.setTarget(pywikibot.ItemPage(self.repo, 'Q35'))
- q = query.fromClaim(claim)
- self.assertEqual(str(q), 'claim[17:35]')
-
- # Datatype: string
- claim = Claim(self.repo, 'P225')
- claim.setTarget('somestring')
- q = query.fromClaim(claim)
- self.assertEqual(str(q), 'string[225:"somestring"]')
-
- # Datatype: external-id
- claim = Claim(self.repo, 'P268')
- claim.setTarget('somestring')
- q = query.fromClaim(claim)
- self.assertEqual(str(q), 'string[268:"somestring"]')
-
- # Datatype: commonsMedia
- claim = Claim(self.repo, 'P18')
- claim.setTarget(
- pywikibot.FilePage(
- pywikibot.Site(self.family, self.code),
- 'Foo.jpg'))
- q = query.fromClaim(claim)
- self.assertEqual(str(q), 'string[18:"Foo.jpg"]')
-
- def testQuerySets(self):
- """Test that we can join queries together correctly."""
- # construct via queries
- qs = query.HasClaim(99, 100).AND(query.HasClaim(99, 101))
-
- self.assertEqual(str(qs), 'claim[99:100] AND claim[99:101]')
-
- self.assertEqual(repr(qs), 'QuerySet(claim[99:100] AND claim[99:101])')
-
- qs = query.HasClaim(99, 100).AND(query.HasClaim(99, 101)).AND(query.HasClaim(95))
-
- self.assertEqual(str(qs), 'claim[99:100] AND claim[99:101] AND claim[95]')
-
- # construct via queries
- qs = query.HasClaim(99, 100).AND([query.HasClaim(99, 101), query.HasClaim(95)])
-
- self.assertEqual(str(qs), 'claim[99:100] AND claim[99:101] AND claim[95]')
-
- qs = query.HasClaim(99, 100).OR([query.HasClaim(99, 101), query.HasClaim(95)])
-
- self.assertEqual(str(qs), 'claim[99:100] OR claim[99:101] OR claim[95]')
-
- q1 = query.HasClaim(99, 100)
- q2 = query.HasClaim(99, 101)
-
- # different joiners get explicit grouping parens (the api also allows
- # implicit, but we don't do that)
- qs1 = q1.AND(q2)
- qs2 = q1.OR(qs1).AND(query.HasClaim(98))
-
- self.assertEqual(str(qs2),
- '(claim[99:100] OR (claim[99:100] AND claim[99:101])) AND claim[98]')
-
- # if the joiners are the same, no need to group
- qs1 = q1.AND(q2)
- qs2 = q1.AND(qs1).AND(query.HasClaim(98))
-
- self.assertEqual(str(qs2),
- 'claim[99:100] AND claim[99:100] AND claim[99:101] AND claim[98]')
-
- qs1 = query.HasClaim(100).AND(query.HasClaim(101))
- qs2 = qs1.OR(query.HasClaim(102))
-
- self.assertEqual(str(qs2), '(claim[100] AND claim[101]) OR claim[102]')
-
- qs = query.Link("enwiki").AND(query.NoLink("dewiki"))
-
- self.assertEqual(str(qs), 'link[enwiki] AND nolink[dewiki]')
-
- def testQueryApiSyntax(self):
- """Test that we can generate the API query correctly."""
- w = query.WikidataQuery("http://example.com")
-
- qs = w.getQueryString(query.Link("enwiki"))
- self.assertEqual(qs, "q=link%5Benwiki%5D")
-
- self.assertEqual(w.getUrl(qs), "http://example.com/api?q=link%5Benwiki%5D")
-
- # check labels and props work OK
- qs = w.getQueryString(query.Link("enwiki"), ['en', 'fr'], ['prop'])
- self.assertEqual(qs, "q=link%5Benwiki%5D&labels=en,fr&props=prop")
-
-
-class TestApiSlowFunctions(TestCase):
-
- """Test slow WikiDataQuery API functions."""
-
- hostname = 'https://wdq.wmflabs.org/api'
-
- def testQueryApiGetter(self):
- """Test that we can actually retreive data and that caching works."""
- w = query.WikidataQuery(cacheMaxAge=0)
-
- # this query doesn't return any items, save a bit of bandwidth!
- q = query.HasClaim(105).AND([query.NoClaim(225), query.HasClaim(100)])
-
- # check that the cache file is created
- cacheFile = w.getCacheFilename(w.getQueryString(q, [], []))
-
- # remove existing cache file
- try:
- os.remove(cacheFile)
- except OSError:
- pass
-
- data = w.query(q)
-
- self.assertFalse(os.path.exists(cacheFile))
-
- w = query.WikidataQuery(cacheMaxAge=0.1)
-
- data = w.query(q)
-
- self.assertTrue(os.path.exists(cacheFile))
-
- self.assertIn('status', data)
- self.assertIn('items', data)
-
- t1 = time.time()
- data = w.query(q)
- t2 = time.time()
-
- # check that the cache access is fast
- self.assertLess(t2 - t1, 0.2)
-
-
-if __name__ == '__main__': # pragma: no cover
- try:
- unittest.main()
- except SystemExit:
- pass
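With wikidataquery.py removed, the remaining item-query path is SPARQL,
via the -sparql option and the WikidataSPARQLPageGenerator kept above. As
a hedged illustration, the WDQ expression claim[31:12280] from the old
pagegenerators help maps to a SPARQL query along these lines:

    import pywikibot
    from pywikibot import pagegenerators

    # Rough equivalent of WDQ 'claim[31:12280]': items whose P31
    # ("instance of") value is Q12280. The SELECT must expose ?item.
    QUERY = 'SELECT ?item WHERE { ?item wdt:P31 wd:Q12280 . }'

    site = pywikibot.Site('wikidata', 'wikidata')
    for page in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=site):
        print(page.title())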
--
To view, visit https://gerrit.wikimedia.org/r/354590
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I8dadf884b7255a1fb4af18c8892df3f8c443ab58
Gerrit-PatchSet: 5
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Multichill <maarten(a)mdammers.nl>
Gerrit-Reviewer: Jean-Frédéric <jeanfrederic.wiki(a)gmail.com>
Gerrit-Reviewer: Lokal Profil <lokal.profil(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: jenkins-bot <>
jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/336103 )
Change subject: [IMPR] Use bot classes for table2wiki.py
......................................................................
[IMPR] Use bot classes for table2wiki.py
- Remove the -skip option, which has not worked for a very long time
- Rename the -auto option to -always and -sql to -mysqlquery, matching
the options we use for other bots, and print a deprecation warning when
the old options are used.
- Add a new option -skipwarning that skips pages with warnings. The old
settings table2wikiAskOnlyWarnings and table2wikiSkipWarnings are removed.
- use treat_page method instead of treat
- use pagegenerators for -namespace option
- use positional_arg_name with 'page' as default
- a new tools method has_module() to check whether a library can be imported
- docs added
Change-Id: Ie50078ae3315ba8ba70946b889a31e403c998dd7
---
M pywikibot/config2.py
M pywikibot/tools/__init__.py
M scripts/table2wiki.py
3 files changed, 115 insertions(+), 134 deletions(-)
Approvals:
Mpaa: Looks good to me, approved
jenkins-bot: Verified
diff --git a/pywikibot/config2.py b/pywikibot/config2.py
index 85ad1a7..9f732f2 100644
--- a/pywikibot/config2.py
+++ b/pywikibot/config2.py
@@ -661,9 +661,6 @@
# sometimes HTML-tables are indented for better reading.
# That can do very ugly results.
deIndentTables = True
-# table2wiki.py works quite stable, so you might switch to True
-table2wikiAskOnlyWarnings = True
-table2wikiSkipWarnings = False
# ############# WEBLINK CHECKER SETTINGS ##############
diff --git a/pywikibot/tools/__init__.py b/pywikibot/tools/__init__.py
index c89bd6a..7e1c9c1 100644
--- a/pywikibot/tools/__init__.py
+++ b/pywikibot/tools/__init__.py
@@ -155,6 +155,16 @@
count = itertools.count
+def has_module(module):
+ """Check whether a module can be imported."""
+ try:
+ __import__(module)
+ except ImportError:
+ return False
+ else:
+ return True
+
+
def empty_iterator():
# http://stackoverflow.com/a/13243870/473890
"""An iterator which does nothing."""
diff --git a/scripts/table2wiki.py b/scripts/table2wiki.py
index b2527f7..82dba5b 100644
--- a/scripts/table2wiki.py
+++ b/scripts/table2wiki.py
@@ -7,32 +7,22 @@
&params;
--xml Retrieve information from a local XML dump (pages_current, see
- http://download.wikimedia.org).
+-always The bot won't ask for confirmation when putting a page
+
+-skipwarning Skip processing a page when a warning occurred.
+ Only used when -always is or becomes True.
+
+-quiet Don't show diffs in -always mode
+
+-mysqlquery Retrieve information from a local mirror.
+ Searches for pages with HTML tables, and tries to convert
+ them on the live wiki.
+
+-xml Retrieve information from a local XML dump
+ (pages_current, see http://download.wikimedia.org).
Argument can also be given as "-xml:filename".
- Searches for pages with HTML tables, and tries to convert them
- on the live wiki.
-
--sql Retrieve information from a local mirror.
- Searches for pages with HTML tables, and tries to convert them
- on the live wiki.
-
--namespace:n Number or name of namespace to process. The parameter can be
- used multiple times. It works in combination with all other
- parameters, except for the -start parameter. If you e.g.
- want to iterate over all categories starting at M, use
- -start:Category:M.
-
-This SQL query can be used to find pages to work on:
-
- SELECT CONCAT('[[', cur_title, ']]')
- FROM cur
- WHERE (cur_text LIKE '%<table%'
- OR cur_text LIKE '%<TABLE%')
- AND cur_title REGEXP "^[A-N]"
- AND cur_namespace=0
- ORDER BY cur_title
- LIMIT 500
+ Searches for pages with HTML tables, and tries to convert
+ them on the live wiki.
Example:
@@ -67,6 +57,11 @@
from pywikibot import pagegenerators
from pywikibot import xmlreader
+from pywikibot.bot import (SingleSiteBot, ExistingPageBot, NoRedirectPageBot,
+ suggest_help, input_yn)
+from pywikibot.exceptions import ArgumentDeprecationWarning
+from pywikibot.tools import has_module, issue_deprecation_warning
+
# This is required for the text that is shown when you run this script
# with the parameter -help.
docuReplacements = {
@@ -88,14 +83,23 @@
yield pywikibot.Page(pywikibot.Site(), entry.title)
-class Table2WikiRobot(object):
+class Table2WikiRobot(SingleSiteBot, ExistingPageBot, NoRedirectPageBot):
- """Bot to convert HTML tables to wiki syntax."""
+ """Bot to convert HTML tables to wiki syntax.
- def __init__(self, generator, quietMode=False):
+ @param generator: the page generator that determines on which pages
+ to work
+ @type generator: generator
+ """
+
+ def __init__(self, **kwargs):
"""Constructor."""
- self.generator = generator
- self.quietMode = quietMode
+ self.availableOptions.update({
+ 'quiet': False, # quiet mode, less output
+ 'skipwarning': False # on warning skip that page
+ })
+
+ super(Table2WikiRobot, self).__init__(site=True, **kwargs)
def convertTable(self, table):
"""
@@ -447,14 +451,10 @@
if not table:
# no more HTML tables left
break
- pywikibot.output(">> Table %i <<" % (convertedTables + 1))
+
# convert the current table
newTable, warningsThisTable, warnMsgsThisTable = self.convertTable(
table)
- # show the changes for this table
- if not self.quietMode:
- pywikibot.showDiff(table.replace('##table##', 'table'),
- newTable)
warningSum += warningsThisTable
for msg in warnMsgsThisTable:
warningMessages += 'In table %i: %s' % (convertedTables + 1,
@@ -465,23 +465,10 @@
pywikibot.output(warningMessages)
return text, convertedTables, warningSum
- def treat(self, page):
- """
- Load a page, convert all HTML tables in its text to wiki syntax, and save the result.
-
- Returns True if the converted table was successfully saved, otherwise returns False.
- """
- pywikibot.output(u'\n>>> %s <<<' % page.title())
- site = page.site
- try:
- text = page.get()
- except pywikibot.NoPage:
- pywikibot.error(u"couldn't find %s" % page.title())
- return False
- except pywikibot.IsRedirectPage:
- pywikibot.output(u'Skipping redirect %s' % page.title())
- return False
- newText, convertedTables, warningSum = self.convertAllHTMLTables(text)
+ def treat_page(self):
+ """Convert all HTML tables in text to wiki syntax and save it."""
+ text = self.current_page.text
+ newText, convertedTables, warnings = self.convertAllHTMLTables(text)
# Check if there are any marked tags left
markedTableTagR = re.compile("<##table##|</##table##>", re.IGNORECASE)
@@ -492,33 +479,34 @@
if convertedTables == 0:
pywikibot.output(u"No changes were necessary.")
- else:
- if config.table2wikiAskOnlyWarnings and warningSum == 0:
- doUpload = True
- else:
- if config.table2wikiSkipWarnings:
- doUpload = True
- else:
- pywikibot.output("There were %i replacement(s) that might lead to bad "
- "output." % warningSum)
- doUpload = (pywikibot.input(
- u'Do you want to change the page anyway? [y|N]') == "y")
- if doUpload:
- # get edit summary message
- if warningSum == 0:
- editSummaryMessage = i18n.twtranslate(site.code, 'table2wiki-no-warning')
- else:
- editSummaryMessage = i18n.twntranslate(
- site.code,
- 'table2wiki-warnings',
- {'count': warningSum}
- )
- page.put_async(newText, summary=editSummaryMessage)
+ return
- def run(self):
- """Check each page passed."""
- for page in self.generator:
- self.treat(page)
+ if warnings:
+ if self.getOption('always') and self.getOption('skipwarning'):
+ pywikibot.output(
+ 'There were %i replacements that might lead to bad '
+ 'output. Skipping.' % warnings)
+ return
+ if not self.getOption('always'):
+ pywikibot.output(
+ 'There were %i replacements that might lead to bad '
+ 'output.' % warnings)
+ if not input_yn('Do you want to change the page anyway'):
+ return
+
+ # get edit summary message
+ if warnings == 0:
+ editSummaryMessage = i18n.twtranslate(
+ self.site.code, 'table2wiki-no-warning')
+ else:
+ editSummaryMessage = i18n.twntranslate(
+ self.site.code,
+ 'table2wiki-warnings',
+ {'count': warnings}
+ )
+ self.put_current(newText, summary=editSummaryMessage,
+ show_diff=not (self.getOption('quiet') and
+ self.getOption('always')))
def main(*args):
@@ -530,76 +518,62 @@
@param args: command line arguments
@type args: list of unicode
"""
- quietMode = False # use -quiet to get less output
- # if the -file argument is used, page titles are stored in this array.
- # otherwise it will only contain one page.
- articles = []
- # if -file is not used, this temporary array is used to read the page title.
- page_title = []
-
- # Which namespaces should be processed?
- # default to [] which means all namespaces will be processed
- namespaces = []
-
- xmlfilename = None
+ options = {}
gen = None
+
+ local_args = pywikibot.handle_args(args)
# This factory is responsible for processing command line arguments
# that are also used by other scripts and that determine on which pages
# to work on.
- genFactory = pagegenerators.GeneratorFactory()
+ genFactory = pagegenerators.GeneratorFactory(positional_arg_name='page')
- for arg in pywikibot.handle_args(args):
- if arg.startswith('-xml'):
- if len(arg) == 4:
- xmlfilename = pywikibot.input(
- u'Please enter the XML dump\'s filename:')
- else:
- xmlfilename = arg[5:]
- gen = TableXmlDumpPageGenerator(xmlfilename)
- elif arg == '-sql':
- query = u"""
+ for arg in local_args:
+ option, sep, value = arg.partition(':')
+ if option == '-xml':
+ filename = value or pywikibot.input(
+ "Please enter the XML dump's filename:")
+ gen = TableXmlDumpPageGenerator(filename)
+ elif option == '-auto':
+ issue_deprecation_warning(
+ 'The usage of "-auto"', '-always',
+ 1, ArgumentDeprecationWarning)
+ options['always'] = True
+ elif option in ['-always', '-quiet', '-skipwarning']:
+ options[option[1:]] = True
+ else:
+ if option in ['-sql', '-mysqlquery']:
+ if not (has_module('oursql') or has_module('MySQLdb')):
+ raise NotImplementedError(
+ 'Neither "oursql" nor "MySQLdb" library is installed.')
+ if option == '-sql':
+ issue_deprecation_warning(
+ 'The usage of "-sql"', '-mysqlquery',
+ 1, ArgumentDeprecationWarning)
+
+ query = value or """
SELECT page_namespace, page_title
FROM page JOIN text ON (page_id = old_id)
WHERE old_text LIKE '%<table%'
-LIMIT 200"""
- gen = pagegenerators.MySQLPageGenerator(query)
- elif arg.startswith('-namespace:'):
- try:
- namespaces.append(int(arg[11:]))
- except ValueError:
- namespaces.append(arg[11:])
- elif arg.startswith('-skip:'):
- articles = articles[articles.index(arg[6:]):]
- elif arg.startswith('-auto'):
- config.table2wikiAskOnlyWarnings = True
- config.table2wikiSkipWarnings = True
- pywikibot.output('Automatic mode!\n')
- elif arg.startswith('-quiet'):
- quietMode = True
- else:
- if not genFactory.handleArg(arg):
- page_title.append(arg)
+"""
+ arg = '-mysqlquery:' + query
+ genFactory.handleArg(arg)
- # if the page is given as a command line argument,
- # connect the title's parts with spaces
- if page_title != []:
- page_title = ' '.join(page_title)
- page = pywikibot.Page(pywikibot.Site(), page_title)
- gen = iter([page])
-
- if not gen:
+ if gen:
+ gen = pagegenerators.NamespaceFilterPageGenerator(
+ gen, genFactory.namespaces)
+ else:
gen = genFactory.getCombinedGenerator()
if gen:
- if namespaces != []:
- gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces)
if not genFactory.nopreload:
gen = pagegenerators.PreloadingGenerator(gen)
- bot = Table2WikiRobot(gen, quietMode)
+ bot = Table2WikiRobot(generator=gen, **options)
bot.run()
+ return True
else:
- pywikibot.showHelp('table2wiki')
+ suggest_help(missing_generator=True)
+ return False
if __name__ == "__main__":
--
To view, visit https://gerrit.wikimedia.org/r/336103
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie50078ae3315ba8ba70946b889a31e403c998dd7
Gerrit-PatchSet: 5
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Xqt <info(a)gno.de>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Magul <tomasz.magulski(a)gmail.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchanek97(a)gmail.com>
Gerrit-Reviewer: Mpaa <mpaa.wiki(a)gmail.com>
Gerrit-Reviewer: Phantom42 <nikitav30(a)gmail.com>
Gerrit-Reviewer: Xqt <info(a)gno.de>
Gerrit-Reviewer: Zhuyifei1999 <zhuyifei1999(a)gmail.com>
Gerrit-Reviewer: jenkins-bot <>