Feature Requests item #2170330, was opened at 2008-10-16 03:20
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=2170330&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: add -hint:latin to interwiki.py
Initial Comment:
interwiki.py already supports -hint:cyril to select all wikis of a family that use the Cyrillic script.
Adding -hint:latin seems only logical, and will be useful e.g. when dealing with proper names, which
can usually be expected to be spelled alike in almost all languages using the Latin script.
----------------------------------------------------------------------
Bugs item #2169485, was opened at 2008-10-15 22:56
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2169485&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: giurrero (giurrero)
Assigned to: Nobody/Anonymous (nobody)
Summary: image.py bug
Initial Comment:
image.py has a bug here:
    if not site.nocapitalize:
        old = '[' + self.oldImage[0].upper() + self.oldImage[0].lower() + ']' + self.oldImage[1:]
    else:
        old = self.oldImage
    old = re.sub('[_ ]', '[_ ]', old)
    escaped = re.escape(old)
    if not self.loose or not self.newImage:
        ImageRegex = re.compile(r'\[\[ *(?:' + '|'.join(site.namespace(6, all = True)) + ')\s*:\s*' + escaped + ' *(?P<parameters>\|[^\n]+|) *\]\]')
    else:
        ImageRegex = re.compile(r'' + escaped)
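Escaping must be the first step. At present the code first replaces every "_" or " " with the character class [_ ] and only then escapes the result, so the brackets of the character class get escaped too:
"my_image" -> "my[_ ]image" -> "my\[\_\ \]image"
I think the approach used by replaceImage in wikipedia.py would be the best solution.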
Pywikipedia [http] trunk/pywikipedia (r5976, Oct 15 2008, 17:28:48)
Python 2.5.2 (r252:60911, Aug 1 2008, 00:37:21)
[GCC 4.3.1 20080507 (prerelease) [gcc-4_3-branch revision 135036]]
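A minimal sketch of the escape-first ordering asked for in this report. It is not the replaceImage code from wikipedia.py; the helper name build_image_pattern and its parameter are hypothetical. Each character is escaped on its own, so the [_ ] and first-letter character classes are added after escaping and never pass through re.escape():

```python
import re


def build_image_pattern(name, first_letter_insensitive=True):
    """Return regex source matching `name`, treating '_' and ' ' alike.

    Hypothetical helper: every character is escaped individually, so the
    character classes built here are never escaped themselves.
    """
    pieces = []
    for index, char in enumerate(name):
        if char in "_ ":
            # Underscore and space are interchangeable in titles.
            pieces.append("[_ ]")
        elif (index == 0 and first_letter_insensitive
              and char.upper() != char.lower()):
            # Accept either case for the first letter.
            pieces.append("[" + re.escape(char.upper())
                          + re.escape(char.lower()) + "]")
        else:
            pieces.append(re.escape(char))
    return "".join(pieces)


pattern = re.compile(build_image_pattern("my_image (1).png") + r"\Z")
print(bool(pattern.match("My image (1).png")))   # True
print(bool(pattern.match("my_image_(1).png")))   # True
print(bool(pattern.match("my_image (1)xpng")))   # False
```

The literal "." and "(" in the file name stay literal, while "_" and " " remain interchangeable.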
----------------------------------------------------------------------
Revision: 5977
Author: russblau
Date: 2008-10-15 17:35:14 +0000 (Wed, 15 Oct 2008)
Log Message:
-----------
more tests and bug-fixes
Modified Paths:
--------------
branches/rewrite/pywikibot/site.py
branches/rewrite/pywikibot/tests/site_tests.py
Modified: branches/rewrite/pywikibot/site.py
===================================================================
--- branches/rewrite/pywikibot/site.py 2008-10-15 17:28:48 UTC (rev 5976)
+++ branches/rewrite/pywikibot/site.py 2008-10-15 17:35:14 UTC (rev 5977)
@@ -1431,7 +1431,7 @@
"blocks: starttime must be before endtime with reverse=True")
return
else:
- if endtime < starttime:
+ if endtime > starttime:
logger.error(
"blocks: endtime must be before starttime with reverse=False")
return
@@ -1443,7 +1443,7 @@
if endtime:
bkgen.request["bkend"] = endtime
if reverse:
- bkgen.request["bkdir"] = newer
+ bkgen.request["bkdir"] = "newer"
if blockids:
bkgen.request["bkids"] = blockids
if users:
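
The two site.py fixes above can be illustrated with a small standalone sketch. It is not the rewrite branch's blocks() method: the function name build_blocks_request is hypothetical, it raises ValueError where the original logs an error and returns, and the reverse=True comparison and the "bkstart" key are assumptions modeled on the error message and on "bkend" in the diff.

```python
def build_blocks_request(starttime=None, endtime=None, reverse=False):
    """Return block-list query parameters, or raise ValueError.

    Timestamps are strings such as "20080203235959", which sort
    chronologically when compared as text.
    """
    if starttime and endtime:
        if reverse and starttime > endtime:
            # Assumed check, inferred from the error message in the diff.
            raise ValueError(
                "starttime must be before endtime with reverse=True")
        if not reverse and endtime > starttime:
            # r5977: the comparison was "<" before the fix.
            raise ValueError(
                "endtime must be before starttime with reverse=False")
    request = {}
    if starttime:
        request["bkstart"] = starttime  # assumed key, mirrors "bkend"
    if endtime:
        request["bkend"] = endtime
    if reverse:
        # r5977: the direction is the string "newer", not a bare name.
        request["bkdir"] = "newer"
    return request


print(build_blocks_request("20080203235959", "20080203000001"))
print(build_blocks_request("20080202000001", "20080202235959", reverse=True))
```

Both calls use the same time ranges as the new tests in site_tests.py, so neither raises.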
Modified: branches/rewrite/pywikibot/tests/site_tests.py
===================================================================
--- branches/rewrite/pywikibot/tests/site_tests.py 2008-10-15 17:28:48 UTC (rev 5976)
+++ branches/rewrite/pywikibot/tests/site_tests.py 2008-10-15 17:35:14 UTC (rev 5977)
@@ -401,15 +401,77 @@
self.assertTrue(len(ai) <= 10)
self.assertTrue(all(isinstance(image, pywikibot.ImagePage)
for image in ai))
+ for impage in mysite.allimages(start="Ba", limit=5):
+ self.assertType(impage, pywikibot.ImagePage)
+ self.assertTrue(mysite.page_exists(impage))
+ self.assertTrue(impage.title(withNamespace=False) >= "Ba")
+ # Bug # 15985
+## for impage in mysite.allimages(start="Da", reverse=True, limit=5):
+## self.assertType(impage, pywikibot.ImagePage)
+## self.assertTrue(mysite.page_exists(impage))
+## self.assertTrue(impage.title() <= "Da")
+ for impage in mysite.allimages(prefix="Ch", limit=5):
+ self.assertType(impage, pywikibot.ImagePage)
+ self.assertTrue(mysite.page_exists(impage))
+ self.assertTrue(impage.title(withNamespace=False).startswith("Ch"))
+ for impage in mysite.allimages(minsize=100, limit=5):
+ self.assertType(impage, pywikibot.ImagePage)
+ self.assertTrue(mysite.page_exists(impage))
+ self.assertTrue(len(impage.text) >= 100)
+ for impage in mysite.allimages(maxsize=200, limit=5):
+ self.assertType(impage, pywikibot.ImagePage)
+ self.assertTrue(mysite.page_exists(impage))
+ self.assertTrue(len(impage.text) <= 200)
def testBlocks(self):
"""Test the site.blocks() method"""
+ props = ("id", "by", "timestamp", "expiry", "reason")
bl = list(mysite.blocks(limit=10))
self.assertTrue(len(bl) <= 10)
- self.assertTrue(all(isinstance(block, dict)
- for block in bl))
+ for block in bl:
+ self.assertType(block, dict)
+ for prop in props:
+ self.assertTrue(prop in block)
+ # timestamps should be in descending order
+ timestamps = [block['timestamp'] for block in bl]
+ for t in xrange(1, len(timestamps)):
+ self.assertTrue(timestamps[t] < timestamps[t-1])
+ b2 = list(mysite.blocks(limit=10, reverse=True))
+ self.assertTrue(len(b2) <= 10)
+ for block in b2:
+ self.assertType(block, dict)
+ for prop in props:
+ self.assertTrue(prop in block)
+ # timestamps should be in ascending order
+ timestamps = [block['timestamp'] for block in b2]
+ for t in xrange(1, len(timestamps)):
+ self.assertTrue(timestamps[t] > timestamps[t-1])
+
+ for block in mysite.blocks(starttime="20080101000001", limit=5):
+ self.assertType(block, dict)
+ for prop in props:
+ self.assertTrue(prop in block)
+ for block in mysite.blocks(endtime="20080131235959", limit=5):
+ self.assertType(block, dict)
+ for prop in props:
+ self.assertTrue(prop in block)
+ for block in mysite.blocks(starttime="20080202000001",
+ endtime="20080202235959",
+ reverse=True, limit=5):
+ self.assertType(block, dict)
+ for prop in props:
+ self.assertTrue(prop in block)
+ for block in mysite.blocks(starttime="20080203235959",
+ endtime="20080203000001",
+ limit=5):
+ self.assertType(block, dict)
+ for prop in props:
+ self.assertTrue(prop in block)
+
+# TODO
+
def testExturlusage(self):
"""Test the site.exturlusage() method"""
@@ -418,6 +480,8 @@
self.assertTrue(len(eu) <= 10)
self.assertTrue(all(isinstance(link, pywikibot.Page)
for link in eu))
+ for link in mysite.exturlusage(url, namespaces=[2, 3], limit=5):
+ self.assertType(link, pywikibot.Page)
def testImageusage(self):
"""Test the site.imageusage() method"""
Revision: 5971
Author: a_engels
Date: 2008-10-15 07:18:52 +0000 (Wed, 15 Oct 2008)
Log Message:
-----------
1. Add a new option -back. If -back is added as an option, ONLY pages that do not have backlinks yet will be worked on.
2. When using -autonomous, the bot will now halt as soon as it finds a conflict, and not needlessly load more pages.
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2008-10-15 07:16:05 UTC (rev 5970)
+++ trunk/pywikipedia/interwiki.py 2008-10-15 07:18:52 UTC (rev 5971)
@@ -169,6 +169,10 @@
you are sure you have first gotten the interwiki on the
starting page exactly right).
(note: without ending colon)
+
+ -back only work on pages that have no backlink from any other
+ language; if a backlink is found, all work on the page
+ will be halted.
The following arguments are only important for users who have accounts for
multiple languages, and specify on which sites the bot should modify pages:
@@ -462,6 +466,7 @@
rememberno = False
followinterwiki = True
minsubjects = config.interwiki_min_subjects
+ nobackonly = False
class Subject(object):
"""
@@ -493,6 +498,7 @@
self.problemfound = False
self.untranslated = None
self.hintsAsked = False
+ self.forcedStop = False
def getFoundDisambig(self, site):
"""
@@ -575,6 +581,15 @@
# If there are any, return them. Otherwise, nothing is in progress.
return self.pending
+ def makeForcedStop(self,counter):
+ """
+ Ends work on the page before the normal end.
+ """
+ for page in self.todo:
+ counter.minus(page.site())
+ self.todo = []
+ self.forcedStop = True
+
def addIfNew(self, page, counter, linkingPage):
"""
Adds the pagelink given to the todo list, but only if we didn't know
@@ -585,6 +600,13 @@
Returns True iff the page is new.
"""
+ if self.forcedStop:
+ return False
+ if globalvar.nobackonly:
+ if page == self.originPage:
+ wikipedia.output("%s has a backlink from %s."%(page,linkingPage))
+ self.makeForcedStop(counter)
+ return False
if self.foundIn.has_key(page):
# not new
self.foundIn[page].append(linkingPage)
@@ -809,6 +831,11 @@
if globalvar.untranslatedonly:
# Ignore the interwiki links.
iw = ()
+ elif globalvar.autonomous and page.site() in [p.site() for p in self.done if p != page and p.exists() and not p.isRedirectPage()]:
+ otherpage = [p for p in self.done if p.site() == page.site() and p != page and p.exists() and not p.isRedirectPage()][0]
+ wikipedia.output(u"Stopping work on %s because duplicate pages %s and %s are found"%(self.originPage.aslink(),otherpage.aslink(True),page.aslink(True)))
+ self.makeForcedStop(counter)
+ iw = ()
elif page.isEmpty() and not page.isCategory():
wikipedia.output(u"NOTE: %s is empty; ignoring it and its interwiki links" % page.aslink(True))
# Ignore the interwiki links
@@ -979,6 +1006,9 @@
be told to make another get request first."""
if not self.isDone():
raise "Bugcheck: finish called before done"
+ if self.forcedStop:
+ wikipedia.output("Stopping work on %s."%self.originPage)
+ return
if self.originPage.isRedirectPage():
return
if not self.untranslated and globalvar.untranslatedonly:
@@ -1677,6 +1707,8 @@
globalvar.minsubjects = int(arg[7:])
elif arg.startswith('-query:'):
globalvar.maxquerysize = int(arg[7:])
+ elif arg == '-back':
+ globalvar.nobackonly = True
else:
generator = genFactory.handleArg(arg)
if generator:
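
The forced-stop logic added in r5971 can be sketched in isolation roughly as follows. This is a simplified model, not the interwiki.py code: pages are plain (site, title) tuples, Counter is a hypothetical stand-in for the bot's per-site counter, and the snake_case names are illustrative only.

```python
class Counter(object):
    """Hypothetical stand-in for the per-site counter of pending pages."""

    def __init__(self):
        self.pending = {}

    def plus(self, site):
        self.pending[site] = self.pending.get(site, 0) + 1

    def minus(self, site):
        self.pending[site] = self.pending.get(site, 0) - 1


class Subject(object):
    """Simplified model of one origin page and the pages found from it."""

    def __init__(self, origin, back_only=False):
        self.origin = origin          # (site, title) tuple
        self.back_only = back_only    # models the -back option
        self.todo = []
        self.found_in = {}
        self.forced_stop = False

    def make_forced_stop(self, counter):
        """End work on the subject before the normal end."""
        for site, _title in self.todo:
            counter.minus(site)
        self.todo = []
        self.forced_stop = True

    def add_if_new(self, page, counter, linking_page):
        """Queue `page` unless it is known or work has been stopped."""
        if self.forced_stop:
            return False
        if self.back_only and page == self.origin:
            # A backlink to the origin exists: abandon the subject.
            self.make_forced_stop(counter)
            return False
        if page in self.found_in:
            self.found_in[page].append(linking_page)
            return False
        self.found_in[page] = [linking_page]
        self.todo.append(page)
        counter.plus(page[0])
        return True


counter = Counter()
origin = ("en", "Berlin")
subject = Subject(origin, back_only=True)
print(subject.add_if_new(("de", "Berlin"), counter, origin))   # True
print(subject.add_if_new(origin, counter, ("de", "Berlin")))   # False
print(subject.forced_stop, subject.todo, counter.pending)
```

Once the backlink is seen, the queued page is released from the counter and every later add_if_new call returns False, which is what lets the bot stop loading further pages for that subject.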
Revision: 5975
Author: filnik
Date: 2008-10-15 12:17:22 +0000 (Wed, 15 Oct 2008)
Log Message:
-----------
Little bugfix in the load-regex
Modified Paths:
--------------
trunk/pywikipedia/checkimages.py
Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py 2008-10-15 08:14:30 UTC (rev 5974)
+++ trunk/pywikipedia/checkimages.py 2008-10-15 12:17:22 UTC (rev 5975)
@@ -1105,7 +1105,7 @@
load_2 = True
# I search with a regex how many user have not the talk page
# and i put them in a list (i find it more easy and secure)
- regl = r"(\"|\')(.*?)\1(?:,\s+?|\])"
+ regl = r"(\"|\')(.*?)\1(?:,|\])"
pl = re.compile(regl, re.UNICODE)
for xl in pl.finditer(raw):
word = xl.group(2).replace('\\\\', '\\')
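
One visible effect of the r5975 change can be checked with a short sketch. The two patterns are copied from the diff; the helper quoted_words and the sample input are hypothetical, and the log message does not say which input actually triggered the fix.

```python
import re

OLD_REGL = r"(\"|\')(.*?)\1(?:,\s+?|\])"  # before: comma needs whitespace after it
NEW_REGL = r"(\"|\')(.*?)\1(?:,|\])"      # after: a bare comma is enough


def quoted_words(regl, raw):
    """Return the quoted items that `regl` finds in `raw`."""
    pl = re.compile(regl, re.UNICODE)
    return [xl.group(2) for xl in pl.finditer(raw)]


raw = "['Alice','Bob', 'Carol']"    # made-up list text with mixed spacing
print(quoted_words(OLD_REGL, raw))  # ["Alice','Bob", 'Carol']
print(quoted_words(NEW_REGL, raw))  # ['Alice', 'Bob', 'Carol']
```

With the old pattern, two items separated by a comma without a following space are merged into one match; the new pattern splits them.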
Feature Requests item #2168298, was opened at 2008-10-15 12:00
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=2168298&group_…
Category: interwiki
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: add -hint:latin to interwiki.py
Initial Comment:
interwiki.py already supports -hint:cyril to select all wikis of a family that use the Cyrillic script.
Adding -hint:latin seems only logical, and will be useful e.g. when dealing with proper names, which
can usually be expected to be spelled alike in almost all languages using the Latin script.
----------------------------------------------------------------------
I have some stuff I'd like to change, but I seem to have fully
forgotten my credentials... What's my user name, and where and when
would I have created a password for it?
--
André Engels, andreengels(a)gmail.com
Revision: 5974
Author: cosoleto
Date: 2008-10-15 08:14:30 +0000 (Wed, 15 Oct 2008)
Log Message:
-----------
Removed Paths:
-------------
trunk/pywikipedia/README.txt
Deleted: trunk/pywikipedia/README.txt
===================================================================
--- trunk/pywikipedia/README.txt 2008-10-15 08:12:42 UTC (rev 5973)
+++ trunk/pywikipedia/README.txt 2008-10-15 08:14:30 UTC (rev 5974)
@@ -1,5 +0,0 @@
-This is a Subversion repository; use the 'svnadmin' tool to examine
-it. Do not add, delete, or modify files here unless you know how
-to avoid corrupting the repository.
-
-Visit http://subversion.tigris.org/ for more information.