Patches item #1904587, was opened at 2008-02-29 05:55
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1904587&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: Interwiki.py - better language fallbacks: dsb, hsb, stq
Initial Comment:
svn diff wikipedia.py
Index: wikipedia.py
===================================================================
--- wikipedia.py (revision 5095)
+++ wikipedia.py (working copy)
@@ -5525,10 +5527,14 @@
return ['ar','tr']
if code=='sk':
return ['cs']
- if code in ['bar','hsb','ksh']:
+ if code in ['bar','ksh','stq']:
return ['de']
if code in ['als','lb']:
return ['de','fr']
+ if code=='dsb':
+ return ['hsb','de']
+ if code=='hsb':
+ return ['dsb','de']
if code=='io':
return ['eo']
if code in ['an','ast','ay','ca','gn','nah','qu']:
----
Adds Saterland Frisian (Seeltersk).
Makes Upper and Lower Sorbian fall back to each other before resorting to German.
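The fallback rules in this patch can be reduced to a small standalone sketch (a simplified stand-in for the real fallback logic in wikipedia.py, which covers many more codes):

```python
# Simplified sketch of the patched fallback chain; the real function in
# wikipedia.py handles many more language codes than shown here.
def altlang(code):
    """Return a list of language codes to try when 'code' has no match."""
    if code in ['bar', 'ksh', 'stq']:   # incl. Saterland Frisian (stq)
        return ['de']
    if code == 'dsb':                   # Lower Sorbian prefers Upper
        return ['hsb', 'de']            # Sorbian before German
    if code == 'hsb':                   # ... and vice versa
        return ['dsb', 'de']
    return []                           # no fallback defined
```

So a bot resolving a dsb page now tries hsb first and only then de.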
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-03-14 14:13
Message:
Logged In: YES
user_id=855050
Originator: NO
Applied in r5129.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1904587&group_…
Bugs item #1899422, was opened at 2008-02-22 09:54
Message generated for change (Comment added) made by multichill
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1899422&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: tlh to be removed from wiktionaries
Initial Comment:
Interwiki links to 'tlh' no longer work in the wiktionaries. The Klingon wiktionary seems to be closed, so the interwiki links should be removed.
----------------------------------------------------------------------
Comment By: Multichill (multichill)
Date: 2008-03-14 15:01
Message:
Logged In: YES
user_id=1777493
Originator: NO
Looks like the language was removed in r5119. Rotem, can you also add this
line so that leftover tlh links are cleaned up?
@@ -341,6 +340,7 @@
'mo': None, #
http://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Mo…
'minnan':'zh-min-nan',
'nb': 'no',
+ 'tlh': None, # Remove Klingon
'tokipona': None,
'zh-tw': 'zh',
'zh-cn': 'zh'
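For illustration, here is a minimal sketch of how such an obsolete-code table is typically applied when cleaning interwiki links. The dict entries come from the diff above; `clean_links` is a hypothetical helper, not pywikipedia's actual code:

```python
# Hypothetical helper showing how the obsolete-code map above would be
# used: None means the project is closed (drop the link); a string
# means the code was renamed (rewrite the link).
obsolete = {
    'mo': None,
    'minnan': 'zh-min-nan',
    'nb': 'no',
    'tlh': None,  # Remove Klingon
}

def clean_links(links):
    """Return interwiki links with obsolete codes dropped or renamed."""
    cleaned = {}
    for code, title in links.items():
        if code in obsolete:
            code = obsolete[code]
            if code is None:
                continue  # closed project: drop the leftover link
        cleaned[code] = title
    return cleaned
```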
----------------------------------------------------------------------
Comment By: Multichill (multichill)
Date: 2008-03-01 12:31
Message:
Logged In: YES
user_id=1777493
Originator: NO
Let's close it, see
http://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Kl…
----------------------------------------------------------------------
Comment By: spacebirdy (real-spacebirdy)
Date: 2008-02-25 10:36
Message:
Logged In: YES
user_id=2018837
Originator: NO
The tlh wiktionary is not closed yet,
_but_ the interwiki links do _not_ work for either the Wiktionary or
the Wikipedia.
Please see https://bugzilla.wikimedia.org/show_bug.cgi?id=9164
If linked, the tlh projects show up in the entry instead of in the left
sidebar, thereby breaking the entry.
Please remove tlh.
Many thanks in advance,
best regards.
----------------------------------------------------------------------
Comment By: Andre Engels (a_engels)
Date: 2008-02-25 09:18
Message:
Logged In: YES
user_id=843018
Originator: NO
Re-opening: interwiki links to tlh: don't work any more, so it's worse
than silly to keep them. Please remove tlh from the language list and
add it to the obsolete list.
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2008-02-22 16:34
Message:
Logged In: YES
user_id=1327030
Originator: NO
wiktionary:tlh is not closed yet.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1899422&group_…
Bugs item #1914247, was opened at 2008-03-14 15:21
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1914247&group_…
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Soroush (soroush)
Assigned to: Nobody/Anonymous (nobody)
Summary: i18n for Persian for redirect.py
Initial Comment:
Please add the following to redirect.py, or replace the current version with the attached file, to provide the Persian (Farsi) translations:
for
'en': u'Robot: Fixing double redirect',
please add:
'fa': u'ربات:اصلاح تغییر مسیر دوتایی';
and for
'en': u'Robot: Redirect target doesn\'t exist',
add
'fa': u'ربات:تغییرمسیر مقصد ندارد';
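These message tables are looked up by language code with an English fallback; a minimal sketch of that pattern (the real code goes through wikipedia.translate, not this illustrative helper):

```python
# Sketch of the i18n lookup behind redirect.py's message tables; the
# strings are the ones requested above, and English is the fallback.
msg_double = {
    'en': u'Robot: Fixing double redirect',
    'fa': u'ربات:اصلاح تغییر مسیر دوتایی',
}

def translate(lang, table):
    """Return the message for 'lang', falling back to English."""
    return table.get(lang, table['en'])
```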
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1914247&group_…
Bugs item #1783572, was opened at 2007-08-28 22:11
Message generated for change (Comment added) made by soroush
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1783572&group_…
Category: other
Group: None
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Persian Gulf (persian_gulf)
Assigned to: Nobody/Anonymous (nobody)
Summary: Necessary translation for redirect in wikipedia_family.py
Initial Comment:
This translation should exist in wikipedia_family.py to make $redirect * double work.
The file with added translation is included.
added translation:
self.redirect = {
    'fa': u'تغییرمسیر',
}
This works well after the following bug is solved:
[ pywikipediabot-Bugs-1783561 ] a regex bug on line
----------------------------------------------------------------------
Comment By: Soroush (soroush)
Date: 2008-03-14 15:04
Message:
Logged In: YES
user_id=1916127
Originator: NO
I looked at family.py (my previous id was persian_gulf), and this bug
has been fixed. Closing is OK.
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2007-11-29 19:45
Message:
Logged In: YES
user_id=855050
Originator: NO
It appears that this translation has been added to family.py so this bug
may be fixed (someone who actually reads Farsi should verify this).
----------------------------------------------------------------------
Comment By: Persian Gulf (persian_gulf)
Date: 2007-09-03 16:22
Message:
Logged In: YES
user_id=1710835
Originator: YES
This works very well for the Persian Wikipedia (I tested it on about 500
pages). If you want to take advantage of this for the Persian Wiktionary
as well, a similar line has to be added to the wiktionary_family.py file.
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2007-09-03 03:25
Message:
Logged In: YES
user_id=880694
Originator: NO
Does this really only work for the Farsi Wikipedia, not for the Farsi
Wiktionary etc.?
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1783572&group_…
Revision: 5127
Author: russblau
Date: 2008-03-13 21:12:29 +0000 (Thu, 13 Mar 2008)
Log Message:
-----------
Add throttling, add Page tests
Modified Paths:
--------------
branches/rewrite/pywikibot/__init__.py
branches/rewrite/pywikibot/config.py
branches/rewrite/pywikibot/data/api.py
branches/rewrite/pywikibot/site.py
branches/rewrite/pywikibot/tests/page_tests.py
Added Paths:
-----------
branches/rewrite/pywikibot/throttle.py
Modified: branches/rewrite/pywikibot/__init__.py
===================================================================
--- branches/rewrite/pywikibot/__init__.py 2008-03-13 12:48:27 UTC (rev 5126)
+++ branches/rewrite/pywikibot/__init__.py 2008-03-13 21:12:29 UTC (rev 5127)
@@ -51,6 +51,8 @@
key = '%s:%s:%s' % (fam, code, user)
if not _sites.has_key(key):
_sites[key] = __Site(code=code, fam=fam, user=user)
+ _sites[key].getsiteinfo()
+ _sites[key].login(False)
return _sites[key]
getSite = Site # alias for backwards-compability
@@ -68,3 +70,17 @@
import logging
logging.getLogger().setLevel(logging.DEBUG)
+
+def stopme():
+ """Drop this process from the throttle log.
+
+ Can be called manually if desired, but if not, will be called automatically
+ at Python exit.
+
+ """
+ # only need one drop() call because all throttles use the same global pid
+ Site().get_throttle.drop()
+
+import atexit
+atexit.register(stopme)
+
Modified: branches/rewrite/pywikibot/config.py
===================================================================
--- branches/rewrite/pywikibot/config.py 2008-03-13 12:48:27 UTC (rev 5126)
+++ branches/rewrite/pywikibot/config.py 2008-03-13 21:12:29 UTC (rev 5127)
@@ -288,14 +288,18 @@
# Slow down the robot such that it never makes a second change within
# 'put_throttle' seconds.
put_throttle = 10
+# By default, the get_throttle is turned off, and 'maxlag' is used to
+# control the rate of server access. Set this to non-zero to use a throttle
+# on read access.
+get_throttle = 0
# Sometimes you want to know when a delay is inserted. If a delay is larger
# than 'noisysleep' seconds, it is logged on the screen.
noisysleep = 3.0
# Defer bot edits during periods of database server lag. For details, see
# http://www.mediawiki.org/wiki/Maxlag_parameter
-# You can set this variable to a number of seconds, or to None to disable
-# this behavior.
+# You can set this variable to a number of seconds, or to None (or 0) to
+# disable this behavior.
# It is recommended that you do not change this parameter unless you know
# what you are doing and have a good reason for it!
maxlag = 5
Modified: branches/rewrite/pywikibot/data/api.py
===================================================================
--- branches/rewrite/pywikibot/data/api.py 2008-03-13 12:48:27 UTC (rev 5126)
+++ branches/rewrite/pywikibot/data/api.py 2008-03-13 21:12:29 UTC (rev 5127)
@@ -19,10 +19,10 @@
import time
import urllib
+import config
import pywikibot
from pywikibot import login
-
lagpattern = re.compile(r"Waiting for [\d.]+: (?P<lag>\d+) seconds? lagged")
@@ -94,7 +94,7 @@
if "format" not in kwargs:
self.params["format"] = "json"
if "maxlag" not in kwargs:
- self.params["maxlag"] = "5" # replace with configurable constant?
+ self.params["maxlag"] = str(config.maxlag)
self.update(**kwargs)
# implement dict interface
@@ -229,6 +229,7 @@
# following "if" is used for testing with plugged-in data; it wouldn't
# be needed for actual usage
if not hasattr(self, "data"):
+ site.get_throttle()
self.data = self.request.submit()
if not self.data or not isinstance(self.data, dict):
raise StopIteration
Modified: branches/rewrite/pywikibot/site.py
===================================================================
--- branches/rewrite/pywikibot/site.py 2008-03-13 12:48:27 UTC (rev 5126)
+++ branches/rewrite/pywikibot/site.py 2008-03-13 21:12:29 UTC (rev 5127)
@@ -11,7 +11,9 @@
__version__ = '$Id: $'
import pywikibot
+from pywikibot.throttle import Throttle
from pywikibot.data import api
+import config
import os
import threading
@@ -99,6 +101,14 @@
self._mutex = threading.Lock()
self._locked_pages = []
+ pt_min = min(config.minthrottle, config.put_throttle)
+ self.put_throttle = Throttle(pt_min, config.maxthrottle)
+ self.put_throttle.setDelay(config.put_throttle)
+
+ gt_min = min(config.minthrottle, config.get_throttle)
+ self.get_throttle = Throttle(gt_min, config.maxthrottle)
+ self.get_throttle.setDelay(config.get_throttle)
+
def family(self):
"""Return the associated Family object."""
return self._family
@@ -188,7 +198,7 @@
finally:
self._mutex.release()
-
+
class APISite(BaseSite):
"""API interface to MediaWiki site.
@@ -500,7 +510,7 @@
self.getsiteinfo()
return self._namespaces
- def namespace(self, num, all = False):
+ def namespace(self, num, all=False):
"""Return string containing local name of namespace 'num'.
If optional argument 'all' is true, return a list of all recognized
@@ -528,9 +538,15 @@
in this list.
"""
+ if 'bot' in self.getuserinfo()['groups']:
+ limit = 5000
+ else:
+ limit = 500
+ if followRedirects:
+ limit = limit / 2
bltitle = page.title(withSection=False)
blgen = api.PageGenerator("backlinks", gbltitle=bltitle,
- gbllimit="5000")
+ gbllimit=str(limit))
if namespaces is not None:
blgen.request["gblnamespace"] = u"|".join(unicode(ns)
for ns in namespaces)
Modified: branches/rewrite/pywikibot/tests/page_tests.py
===================================================================
--- branches/rewrite/pywikibot/tests/page_tests.py 2008-03-13 12:48:27 UTC (rev 5126)
+++ branches/rewrite/pywikibot/tests/page_tests.py 2008-03-13 21:12:29 UTC (rev 5127)
@@ -48,7 +48,7 @@
u"Hispanic (U.S. Census)" : u"Hispanic (U.S. Census)",
u"Stołpce" : u"Stołpce",
u"Nowy_Sącz" : u"Nowy Sącz",
- u"battle of Węgierska Górka" : u"Battle of Węgierska Górka",
+ u"battle of Węgierska Górka" : u"Battle of Węgierska Górka",
}
# random bunch of possible section titles
sections = [u"",
@@ -69,7 +69,17 @@
site)
self.assertEqual(m.namespace, num)
+ def testTitles(self):
+ """Test that Link() normalizes titles"""
+ for title in self.titles:
+ for num in (0, 1):
+ l = pywikibot.page.Link(self.namespaces[num][0]+title)
+ self.assertEqual(l.title, self.titles[title])
+ # prefixing name with ":" shouldn't change result
+ m = pywikibot.page.Link(":"+self.namespaces[num][0]+title)
+ self.assertEqual(m.title, self.titles[title])
+
class TestPageObject(unittest.TestCase):
def testSite(self):
"""Test site() method"""
Added: branches/rewrite/pywikibot/throttle.py
===================================================================
--- branches/rewrite/pywikibot/throttle.py (rev 0)
+++ branches/rewrite/pywikibot/throttle.py 2008-03-13 21:12:29 UTC (rev 5127)
@@ -0,0 +1,200 @@
+# -*- coding: utf-8 -*-
+"""
+Mechanics to slow down wiki read and/or write rate.
+"""
+#
+# (C) Pywikipedia bot team, 2008
+#
+# Distributed under the terms of the MIT license.
+#
+__version__ = '$Id: $'
+
+import config
+import pywikibot
+
+import logging
+import math
+import threading
+import time
+
+pid = False # global process identifier
+ # Don't check for other processes unless this is set
+
+
+class Throttle(object):
+ """Control rate of access to wiki server
+
+ Calling this object blocks the calling thread until at least 'delay'
+ seconds have passed since the previous call.
+
+ Each Site initiates two Throttle objects: get_throttle to control
+ the rate of read access, and put_throttle to control the rate of write
+ access. These are available as the Site.get_throttle and Site.put_throttle
+ objects.
+
+ """
+ def __init__(self, mindelay=config.minthrottle,
+ maxdelay=config.maxthrottle,
+ multiplydelay=True):
+ self.lock = threading.RLock()
+ self.mindelay = mindelay
+ self.maxdelay = maxdelay
+ self.now = 0
+ self.next_multiplicity = 1.0
+ self.checkdelay = 240 # Check logfile again after this many seconds
+ self.dropdelay = 360 # Ignore processes that have not made
+ # a check in this many seconds
+ self.releasepid = 1800 # Free the process id after this many seconds
+ self.lastwait = 0.0
+ self.delay = 0
+ if multiplydelay:
+ self.checkMultiplicity()
+ self.setDelay(mindelay)
+
+ def logfn(self):
+ return config.datafilepath('throttle.log')
+
+ def checkMultiplicity(self):
+ global pid
+ self.lock.acquire()
+ logging.debug("Checking multiplicity: pid = %s" % pid)
+ try:
+ processes = {}
+ my_pid = 1
+ count = 1
+ try:
+ f = open(self.logfn(), 'r')
+ except IOError:
+ if not pid:
+ pass
+ else:
+ raise
+ else:
+ now = time.time()
+ for line in f.readlines():
+ try:
+ line = line.split(' ')
+ this_pid = int(line[0])
+ ptime = int(line[1].split('.')[0])
+ if now - ptime <= self.releasepid:
+ if now - ptime <= self.dropdelay \
+ and this_pid != pid:
+ count += 1
+ processes[this_pid] = ptime
+ if this_pid >= my_pid:
+ my_pid = this_pid+1
+ except (IndexError, ValueError):
+ pass # Sometimes the file gets corrupted
+ # ignore that line
+
+ if not pid:
+ pid = my_pid
+ self.checktime = time.time()
+ processes[pid] = self.checktime
+ f = open(self.logfn(), 'w')
+ for p in processes.keys():
+ f.write(str(p)+' '+str(processes[p])+'\n')
+ f.close()
+ self.process_multiplicity = count
+ pywikibot.output(
+ u"Found %s processes running, including the current process."
+ % count)
+ finally:
+ self.lock.release()
+
+ def setDelay(self, delay=config.minthrottle, absolute=False):
+ """Set the nominal delay in seconds."""
+ self.lock.acquire()
+ try:
+ if absolute:
+ self.maxdelay = delay
+ self.mindelay = delay
+ self.delay = delay
+ # Start the delay count now, not at the next check
+ self.now = time.time()
+ finally:
+ self.lock.release()
+
+ def getDelay(self):
+ """Return the actual delay, accounting for multiple processes.
+
+ This value is the maximum wait between reads/writes, not taking
+ account of how much time has elapsed since the last access.
+
+ """
+ global pid
+ thisdelay = self.delay
+ if pid: # If set, we're checking for multiple processes
+ if time.time() > self.checktime + self.checkdelay:
+ self.checkMultiplicity()
+ if thisdelay < (self.mindelay * self.next_multiplicity):
+ thisdelay = self.mindelay * self.next_multiplicity
+ elif thisdelay > self.maxdelay:
+ thisdelay = self.maxdelay
+ thisdelay *= self.process_multiplicity
+ return thisdelay
+
+ def waittime(self):
+ """Return waiting time in seconds if a query would be made right now"""
+ # Take the previous requestsize in account calculating the desired
+ # delay this time
+ thisdelay = self.getDelay()
+ now = time.time()
+ ago = now - self.now
+ if ago < thisdelay:
+ delta = thisdelay - ago
+ return delta
+ else:
+ return 0.0
+
+ def drop(self):
+ """Remove me from the list of running bots processes."""
+ self.checktime = 0
+ processes = {}
+ try:
+ f = open(self.logfn(), 'r')
+ except IOError:
+ return
+ else:
+ now = time.time()
+ for line in f.readlines():
+ try:
+ line = line.split(' ')
+ this_pid = int(line[0])
+ ptime = int(line[1].split('.')[0])
+ if now - ptime <= self.releasepid and this_pid != pid:
+ processes[this_pid] = ptime
+ except (IndexError,ValueError):
+ pass # Sometimes the file gets corrupted - ignore that line
+ f = open(self.logfn(), 'w')
+ for p in processes.keys():
+ f.write(str(p)+' '+str(processes[p])+'\n')
+ f.close()
+
+ def __call__(self, requestsize=1):
+ """
+ Block the calling program if the throttle time has not expired.
+
+ Parameter requestsize is the number of Pages to be read/written;
+ multiply delay time by an appropriate factor.
+ """
+ self.lock.acquire()
+ try:
+ waittime = self.waittime()
+ # Calculate the multiplicity of the next delay based on how
+ # big the request is that is being posted now.
+ # We want to add "one delay" for each factor of two in the
+ # size of the request. Getting 64 pages at once allows 6 times
+ # the delay time for the server.
+ self.next_multiplicity = math.log(1+requestsize)/math.log(2.0)
+ # Announce the delay if it exceeds a preset limit
+ if waittime > config.noisysleep:
+ pywikibot.output(u"Sleeping for %.1f seconds, %s"
+ % (waittime,
+ time.strftime("%Y-%m-%d %H:%M:%S",
+ time.localtime()))
+ )
+ time.sleep(waittime)
+ self.now = time.time()
+ finally:
+ self.lock.release()
+
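The request-size scaling used in Throttle.__call__ can be checked in isolation; this sketch isolates just that formula. Each doubling of the request size adds one unit of delay multiplicity, so a 64-page batch earns roughly six base delays for the server to recover:

```python
import math

# Isolated sketch of the multiplicity formula from Throttle.__call__:
# log2(1 + requestsize), written via the base-change identity.
def next_multiplicity(requestsize):
    return math.log(1 + requestsize) / math.log(2.0)
```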
Bugs item #1913728, was opened at 2008-03-13 14:02
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1913728&group_…
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Wikipedia.py ImagePage() Bug
Initial Comment:
Does ImagePage() not support UTF-8 decoding? See:
'utf8' codec can't decode bytes in position 13482-13484: invalid data
ERROR: Invalid characters found on http://zh.wikipedia.org/w/index.php?title=Image%3AEq2eastbadtranslation.jpg…, replaced by \ufffd.
The bot's source code is here:
http://botwiki.sno.cc/wiki/Python:Deledpimage.py
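The reported message is the standard strict-UTF-8 decode failure. A sketch of the behavior the ERROR line describes, using a made-up byte string (not the actual page data):

```python
# Hypothetical payload: valid UTF-8 followed by two invalid bytes.
raw = b'abc\xe4\xb8\xad bad \xff\xfe bytes'

try:
    text = raw.decode('utf-8')  # strict decoding raises on \xff
except UnicodeDecodeError:
    # Fall back as the ERROR message describes: each undecodable byte
    # is replaced with U+FFFD and processing continues.
    text = raw.decode('utf-8', errors='replace')
```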
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-03-13 14:22
Message:
Logged In: YES
user_id=855050
Originator: NO
Please provide the entire traceback that was printed with the exception.
It doesn't do much good to look at the source code of your bot if we don't
know which line caused the exception.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1913728&group_…
Revision: 5126
Author: russblau
Date: 2008-03-13 12:48:27 +0000 (Thu, 13 Mar 2008)
Log Message:
-----------
Code and docstring cleanup; no functional changes.
Modified Paths:
--------------
trunk/pywikipedia/category.py
Modified: trunk/pywikipedia/category.py
===================================================================
--- trunk/pywikipedia/category.py 2008-03-11 20:32:34 UTC (rev 5125)
+++ trunk/pywikipedia/category.py 2008-03-13 12:48:27 UTC (rev 5126)
@@ -77,7 +77,7 @@
#
# Distributed under the terms of the MIT license.
#
-import os, re, sys, string, pickle, bz2
+import os, re, sys, pickle, bz2
import wikipedia, catlib, config, pagegenerators
# This is required for the text that is shown when you run this script
@@ -270,13 +270,16 @@
f.close()
def sorted_by_last_name(catlink, pagelink):
- '''
- given a Category, returns a Category which has an explicit sort key which
- sorts persons by their last names.
- Trailing words in brackets will be removed.
- Example: If category_name is 'Author' and pl is a Page to
- [[Alexandre Dumas (senior)]], this function will return this Category:
+ '''Return a Category with key that sorts persons by their last names.
+
+ Parameters: catlink - The Category to be linked
+ pagelink - the Page to be placed in the category
+
+ Trailing words in brackets will be removed. Example: If
+ category_name is 'Author' and pl is a Page to [[Alexandre Dumas
+ (senior)]], this function will return this Category:
[[Category:Author|Dumas, Alexandre]]
+
'''
page_name = pagelink.title()
site = pagelink.site()
@@ -288,28 +291,28 @@
page_name = match_object.group(1)
split_string = page_name.split(' ')
if len(split_string) > 1:
- # pull last part of the name to the beginning, and append the rest after a comma
- # e.g. "John von Neumann" becomes "Neumann, John von"
- sorted_key = split_string[-1] + ', ' + string.join(split_string[:-1], ' ')
+ # pull last part of the name to the beginning, and append the
+ # rest after a comma; e.g., "John von Neumann" becomes
+ # "Neumann, John von"
+ sorted_key = split_string[-1] + ', ' + ' '.join(split_string[:-1])
# give explicit sort key
return wikipedia.Page(site, catlink.title() + '|' + sorted_key)
else:
return wikipedia.Page(site, catlink.title())
def add_category(sort_by_last_name = False):
- '''
- A robot to mass-add a category to a list of pages.
- '''
+ '''A robot to mass-add a category to a list of pages.'''
site = wikipedia.getSite()
if gen:
- newcatTitle = wikipedia.input(u'Category to add (do not give namespace):')
- if not wikipedia.getSite().nocapitalize:
+ newcatTitle = wikipedia.input(
+ u'Category to add (do not give namespace):')
+ if not site.nocapitalize:
newcatTitle = newcatTitle[:1].capitalize() + newcatTitle[1:]
# set edit summary message
- wikipedia.setAction(wikipedia.translate(wikipedia.getSite(), msg_add) % newcatTitle)
+ wikipedia.setAction(wikipedia.translate(site, msg_add) % newcatTitle)
- cat_namespace = wikipedia.getSite().category_namespaces()[0]
+ cat_namespace = site.category_namespaces()[0]
answer = ''
for page in gen:
@@ -321,7 +324,9 @@
if answer == 'a':
confirm = ''
while confirm not in ('y','n'):
- confirm = wikipedia.input(u'This should be used if and only if you are sure that your links are correct! Are you sure? [y/n]:')
+ confirm = wikipedia.input(u"""\
+This should be used if and only if you are sure that your links are correct!
+Are you sure? [y/n]:""")
if confirm == 'n':
answer = ''
@@ -329,24 +334,31 @@
try:
text = page.get()
except wikipedia.NoPage:
- wikipedia.output(u"%s doesn't exist yet. Ignoring." % (page.title()))
+ wikipedia.output(u"%s doesn't exist yet. Ignoring."
+ % (page.title()))
pass
- except wikipedia.IsRedirectPage,arg:
- redirTarget = wikipedia.Page(site,arg.args[0])
- wikipedia.output(u"WARNING: %s is redirect to %s. Ignoring." % (page.title(), redirTarget.title()))
+ except wikipedia.IsRedirectPage, arg:
+ redirTarget = wikipedia.Page(site, arg.args[0])
+ wikipedia.output(
+ u"WARNING: %s is redirect to %s. Ignoring."
+ % (page.title(), redirTarget.title()))
else:
cats = page.categories()
# Show the title of the page we're working on.
# Highlight the title in purple.
- wikipedia.output(u"\n\n>>> \03{lightpurple}%s\03{default} <<<" % page.title())
+ wikipedia.output(
+ u"\n\n>>> \03{lightpurple}%s\03{default} <<<"
+ % page.title())
wikipedia.output(u"Current categories:")
for cat in cats:
wikipedia.output(u"* %s" % cat.title())
- catpl = wikipedia.Page(site, cat_namespace + ':' + newcatTitle)
+ catpl = wikipedia.Page(site,
+ cat_namespace + ':' + newcatTitle)
if sort_by_last_name:
catpl = sorted_by_last_name(catpl, page)
if catpl in cats:
- wikipedia.output(u"%s is already in %s." % (page.title(), catpl.title()))
+ wikipedia.output(u"%s is already in %s."
+ % (page.title(), catpl.title()))
else:
wikipedia.output(u'Adding %s' % catpl.aslink())
cats.append(catpl)
@@ -355,12 +367,18 @@
try:
page.put(text)
except wikipedia.EditConflict:
- wikipedia.output(u'Skipping %s because of edit conflict' % (page.title()))
+ wikipedia.output(
+ u'Skipping %s because of edit conflict'
+ % (page.title()))
class CategoryMoveRobot:
- def __init__(self, oldCatTitle, newCatTitle, batchMode = False, editSummary = '', inPlace = False, moveCatPage = True, deleteEmptySourceCat = True, titleRegex = None):
+ """Robot to move pages from one category to another."""
+ def __init__(self, oldCatTitle, newCatTitle, batchMode=False,
+ editSummary='', inPlace=False, moveCatPage=True,
+ deleteEmptySourceCat=True, titleRegex=None):
+ site = wikipedia.getSite()
self.editSummary = editSummary
- self.oldCat = catlib.Category(wikipedia.getSite(), 'Category:' + oldCatTitle)
+ self.oldCat = catlib.Category(site, 'Category:' + oldCatTitle)
self.newCatTitle = newCatTitle
self.inPlace = inPlace
self.moveCatPage = moveCatPage
@@ -371,19 +389,24 @@
if self.editSummary:
wikipedia.setAction(self.editSummary)
else:
- wikipedia.setAction(wikipedia.translate(wikipedia.getSite(),msg_change) % self.oldCat.title())
+ wikipedia.setAction(wikipedia.translate(site, msg_change)
+ % self.oldCat.title())
def run(self):
- newCat = catlib.Category(wikipedia.getSite(), 'Category:' + self.newCatTitle)
+ site = wikipedia.getSite()
+ newCat = catlib.Category(site, 'Category:' + self.newCatTitle)
# Copy the category contents to the new category page
copied = False
oldMovedTalk = None
if self.oldCat.exists() and self.moveCatPage:
- copied = self.oldCat.copyAndKeep(self.newCatTitle, wikipedia.translate(wikipedia.getSite(), cfd_templates))
+ copied = self.oldCat.copyAndKeep(
+ self.newCatTitle,
+ wikipedia.translate(site, cfd_templates))
# Also move the talk page
if copied:
- reason = wikipedia.translate(wikipedia.getSite(), deletion_reason_move) % (self.newCatTitle, self.newCatTitle)
+ reason = wikipedia.translate(site, deletion_reason_move) \
+ % (self.newCatTitle, self.newCatTitle)
oldTalk = self.oldCat.toggleTalkPage()
if oldTalk.exists():
newTalkTitle = newCat.toggleTalkPage().title()
@@ -391,30 +414,39 @@
oldMovedTalk = oldTalk
# Move articles
- gen = pagegenerators.CategorizedPageGenerator(self.oldCat, recurse = False)
+ gen = pagegenerators.CategorizedPageGenerator(self.oldCat,
+ recurse=False)
preloadingGen = pagegenerators.PreloadingGenerator(gen)
for article in preloadingGen:
- if not self.titleRegex or re.search(self.titleRegex,article.title()):
- catlib.change_category(article, self.oldCat, newCat, inPlace=self.inPlace)
+ if not self.titleRegex or re.search(self.titleRegex,
+ article.title()):
+ catlib.change_category(article, self.oldCat, newCat,
+ inPlace=self.inPlace)
# Move subcategories
- gen = pagegenerators.SubCategoriesPageGenerator(self.oldCat, recurse = False)
+ gen = pagegenerators.SubCategoriesPageGenerator(self.oldCat,
+ recurse=False)
preloadingGen = pagegenerators.PreloadingGenerator(gen)
for subcategory in preloadingGen:
- if not self.titleRegex or re.search(self.titleRegex,subcategory.title()):
- catlib.change_category(subcategory, self.oldCat, newCat, inPlace=self.inPlace)
+ if not self.titleRegex or re.search(self.titleRegex,
+ subcategory.title()):
+ catlib.change_category(subcategory, self.oldCat, newCat,
+ inPlace=self.inPlace)
# Delete the old category and its moved talk page
if copied and self.deleteEmptySourceCat == True:
if self.oldCat.isEmpty():
- reason = wikipedia.translate(wikipedia.getSite(), deletion_reason_move) % (self.newCatTitle, self.newCatTitle)
+ reason = wikipedia.translate(site, deletion_reason_move) \
+ % (self.newCatTitle, self.newCatTitle)
confirm = not self.batchMode
self.oldCat.delete(reason, confirm, mark = True)
if oldMovedTalk is not None:
oldMovedTalk.delete(reason, confirm, mark = True)
else:
- wikipedia.output('Couldn\'t delete %s - not empty.' % self.oldCat.title())
+ wikipedia.output('Couldn\'t delete %s - not empty.'
+ % self.oldCat.title())
+
class CategoryListifyRobot:
'''
Creates a list containing all of the members in a category.