Bugs item #1830920, was opened at 2007-11-13 10:10
Message generated for change (Comment added) made by rotemliss
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1830920&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: "replace.py -nocase -excepttext" Unicode AttributeError
Initial Comment:
"python replace.py -start:! -nocase -excepttext:42 foo bar" crashes:
Checked for running processes. 2 processes currently running, including the current process.
Retrieving Allpages special page for wikipedia:fr from %21, namespace 0
Getting 50 pages from wikipedia:fr...
Traceback (most recent call last):
  File "replace.py", line 556, in <module>
    main()
  File "replace.py", line 552, in main
    bot.run()
  File "replace.py", line 299, in run
    if self.isTextExcepted(original_text):
  File "replace.py", line 259, in isTextExcepted
    if exc.search(original_text):
AttributeError: 'unicode' object has no attribute 'search'
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2007-12-17 22:31
Message:
Logged In: YES
user_id=1327030
Originator: NO
Partially reverted in r4727. This needs to be reconsidered, as
self.exceptions seems to contain both regexps and strings.
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2007-11-15 16:28
Message:
Logged In: YES
user_id=1327030
Originator: NO
Fixed in r4557.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2007-11-14 20:46
Message:
Logged In: NO
I would add that calling replace.py with -nocase together with -excepttext
and/or -excepttitle causes the same behavior.
----------------------------------------------------------------------
Bugs item #1852163, was opened at 2007-12-17 10:15
Message generated for change (Comment added) made by rotemliss
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1852163&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: John Vandenberg (zeroj)
Assigned to: Nobody/Anonymous (nobody)
Summary: replace -except[text|title] syntax
Initial Comment:
The syntax in replace.py is broken in the latest revision.
Calling with -excepttext results in:
Traceback (most recent call last):
  File "replace.py", line 602, in ?
    main()
  File "replace.py", line 598, in main
    bot.run()
  File "replace.py", line 329, in run
    if self.isTextExcepted(original_text):
  File "replace.py", line 288, in isTextExcepted
    if exc.find(original_text) > -1:
AttributeError: find
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2007-12-17 22:29
Message:
Logged In: YES
user_id=1327030
Originator: NO
Fixed in r4727.
----------------------------------------------------------------------
Revision: 4727
Author: rotem
Date: 2007-12-17 20:27:11 +0000 (Mon, 17 Dec 2007)
Log Message:
-----------
Reverting part of r4557: self.exceptions seems to have both strings and regexps. Should treat it differently.
Modified Paths:
--------------
trunk/pywikipedia/replace.py
Modified: trunk/pywikipedia/replace.py
===================================================================
--- trunk/pywikipedia/replace.py 2007-12-17 20:19:17 UTC (rev 4726)
+++ trunk/pywikipedia/replace.py 2007-12-17 20:27:11 UTC (rev 4727)
@@ -212,14 +212,14 @@
     def isTitleExcepted(self, title):
         if self.exceptions.has_key('title'):
             for exc in self.exceptions['title']:
-                if exc.find(title) > -1:
+                if exc.search(title):
                     return True
         return False

     def isTextExcepted(self, text):
         if self.exceptions.has_key('text-contains'):
             for exc in self.exceptions['text-contains']:
-                if exc.find(text) > -1:
+                if exc.search(text):
                     return True
         return False

@@ -274,7 +274,7 @@
         """
         if self.exceptions.has_key('title'):
             for exc in self.exceptions['title']:
-                if exc.find(title) > -1:
+                if exc.search(title):
                     return True
         return False

@@ -284,7 +284,7 @@
         """
         if self.exceptions.has_key('text-contains'):
             for exc in self.exceptions['text-contains']:
-                if exc.find(original_text) > -1:
+                if exc.search(original_text):
                     return True
         return False
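
Editor's note: both tracebacks above come from the same cause. The
exception lists can hold plain strings as well as compiled regexps, and the
two types offer different methods (a string has no search(), and a compiled
pattern has no find()). Also, exc.find(text) on a string looks for the page
text inside the exception, which is the reverse of what is intended. One
possible approach, shown here only as a minimal sketch in modern Python, is
to turn every entry into a compiled pattern once, up front. The helper names
compile_exceptions and is_text_excepted and the literal flag are
hypothetical and are not part of replace.py; only the 'text-contains' key is
taken from the diff above.

```python
import re


def compile_exceptions(exceptions, ignore_case=False, literal=True):
    """Return a copy of `exceptions` where every entry is a compiled regex.

    Hypothetical helper: plain strings are compiled (escaped first when
    `literal` is true); entries that are already compiled are kept as-is.
    """
    flags = re.IGNORECASE if ignore_case else 0
    compiled = {}
    for key, entries in exceptions.items():
        patterns = []
        for entry in entries:
            if isinstance(entry, str):
                source = re.escape(entry) if literal else entry
                entry = re.compile(source, flags)
            patterns.append(entry)
        compiled[key] = patterns
    return compiled


def is_text_excepted(exceptions, text):
    """True if any 'text-contains' pattern is found in `text`."""
    return any(p.search(text) for p in exceptions.get('text-contains', []))
```

After normalisation, the check only ever calls search(), so it no longer
matters whether an exception arrived as a string or as a regexp.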
Revision: 4725
Author: rotem
Date: 2007-12-17 20:08:58 +0000 (Mon, 17 Dec 2007)
Log Message:
-----------
Not anymore.
Modified Paths:
--------------
trunk/pywikipedia/catlib.py
Modified: trunk/pywikipedia/catlib.py
===================================================================
--- trunk/pywikipedia/catlib.py 2007-12-17 16:23:24 UTC (rev 4724)
+++ trunk/pywikipedia/catlib.py 2007-12-17 20:08:58 UTC (rev 4725)
@@ -89,9 +89,8 @@
Cache results of _parseCategory for a second call.
If recurse is a bool, and value is True, then recursively retrieves
- contents of all subcategories without limit. (WARNING: can lead to
- infinite loops!) If recurse is an int, recursively retrieves
- contents of subcategories to that depth only.
+ contents of all subcategories without limit. If recurse is an int,
+ recursively retrieves contents of subcategories to that depth only.
Other parameters are analogous to _parseCategory(). If purge is True,
cached results will be discarded. If startFrom is used, nothing
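
Editor's note: the recurse semantics described in the docstring above (True
means unlimited depth, an int means that many levels) can be illustrated
with a small stand-alone sketch. It is not the catlib implementation: the
graph dictionary, the function name iter_members and the seen set are
assumptions made for this example. The seen set is what keeps a cycle
between categories from recursing forever.

```python
def iter_members(graph, category, recurse=False, seen=None):
    """Yield the pages of `category`, descending as `recurse` allows.

    graph:   hypothetical dict mapping name -> (pages, subcategories)
    recurse: True = no depth limit, int n = n levels, False or 0 = none
    """
    if seen is None:
        seen = set()          # fresh set for each top-level call
    if category in seen:
        return                # already visited, so a cycle stops here
    seen.add(category)
    pages, subcats = graph[category]
    for page in pages:
        yield page
    if not recurse:
        return
    # bool is a subclass of int, so True must be tested for explicitly
    next_recurse = True if recurse is True else recurse - 1
    for sub in subcats:
        yield from iter_members(graph, sub, next_recurse, seen)
```

With a graph in which A contains B, and B contains both C and A,
iter_members(graph, 'A', recurse=True) visits each category once and then
terminates.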
Bugs item #1851933, was opened at 2007-12-16 12:49
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1851933&group_…
Category: category
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Bernhard Mayr (falk_steinhauer)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bug in catlib.py (Problems with caching-mechanism)
Initial Comment:
I found a bug in catlib.py that is caused by the caching functionality.
Line 132 (if not page in cache:) causes problems.
If an article belongs to both [[Category:A]] and [[Category:B]] and a script iterates over both categories, catB.articles() does not yield the article, because it was already cached during the earlier call to catA.articles().
It would be better if catB.articles() yielded every article in that category, because sometimes just the article names are of interest.
I did not fix the bug myself because I do not yet understand the caching mechanism.
My working copy is at revision 4720 (updated today).
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2007-12-17 08:36
Message:
Logged In: YES
user_id=855050
Originator: NO
Fixed in r4722
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2007-12-16 13:23
Message:
Logged In: YES
user_id=1327030
Originator: NO
Python evaluates default parameter values only once, when the function is
defined, so a mutable default such as cache=[] keeps its contents between
calls. Will fix that soon.
----------------------------------------------------------------------
Comment By: Bernhard Mayr (falk_steinhauer)
Date: 2007-12-16 13:08
Message:
Logged In: YES
user_id=1810075
Originator: YES
As a workaround, I moved line 134 of catlib.py (yield ARTICLE, page) outside
the if block at line 132 (if not page in cache:).
----------------------------------------------------------------------
Revision: 4722
Author: russblau
Date: 2007-12-17 13:35:43 +0000 (Mon, 17 Dec 2007)
Log Message:
-----------
Fix bug #1851933 (see http://www.python.org/doc/faq/general/#why-are-default-values-shared-betwee…)
Modified Paths:
--------------
trunk/pywikipedia/catlib.py
Modified: trunk/pywikipedia/catlib.py
===================================================================
--- trunk/pywikipedia/catlib.py 2007-12-17 11:58:19 UTC (rev 4721)
+++ trunk/pywikipedia/catlib.py 2007-12-17 13:35:43 UTC (rev 4722)
@@ -84,7 +84,7 @@
return '[[%s]]' % titleWithSortKey
def _getContentsAndSupercats(self, recurse=False, purge=False,
- startFrom=None, cache=[]):
+ startFrom=None, cache=None):
"""
Cache results of _parseCategory for a second call.
@@ -99,6 +99,8 @@
This should not be used outside of this module.
"""
+ if cache is None:
+ cache = []
if purge:
self.completelyCached = False
if recurse:
@@ -120,7 +122,7 @@
# this method recursively; therefore, do not cache
# them again
for item in subcat._getContentsAndSupercats(newrecurse,
- purge):
+ purge, cache=cache):
if item[0] != SUPERCATEGORY:
yield item
for supercat in self.supercatCache:
@@ -141,8 +143,8 @@
# contents of subcategory are cached by calling
# this method recursively; therefore, do not cache
# them again
- for item in page._getContentsAndSupercats(newrecurse,
- purge):
+ for item in page._getContentsAndSupercats(
+ newrecurse, purge, cache=cache):
if item[0] != SUPERCATEGORY:
yield item
elif tag == SUPERCATEGORY:
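
Editor's note: the behaviour fixed in r4722 can be reproduced in isolation.
A default value is created once, when the def statement runs, so a list
used as a default is shared by every call that does not pass its own. The
two functions below are a minimal illustration with hypothetical names;
only the "cache=None, then create the list inside" pattern is taken from
the diff above.

```python
def collect_shared(item, cache=[]):
    """Buggy variant: the same default list is reused on every call."""
    cache.append(item)
    return cache


def collect_fresh(item, cache=None):
    """Fixed variant: a new list is created whenever none is passed in."""
    if cache is None:
        cache = []
    cache.append(item)
    return cache
```

Calling collect_shared('a') and then collect_shared('b') returns a list
holding both items the second time, whereas collect_fresh('b') returns only
['b']. Passing cache=cache explicitly in the recursive calls, as r4722 does,
keeps the sharing where it is wanted (within one traversal) and nowhere
else.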
Bugs item #1852276, was opened at 2007-12-17 12:50
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1852276&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: Endless loop in weblinkchecker.py
Initial Comment:
$ python weblinkchecker.py -start:00f -family:wikipedia -lang:ksh -v
Checked for running processes. 1 processes currently running, including the current process.
Pywikipediabot (r4720 (wikipedia.py), Dec 15 2007, 18:57:27)
Python 2.4.4 (#2, Aug 16 2007, 00:34:54)
[GCC 4.1.3 20070812 (prerelease) (Debian 4.1.2-15)]
Retrieving Allpages special page for wikipedia:ksh from 00f, namespace 0
Retrieving Allpages special page for wikipedia:ksh from 00f, namespace 0
Retrieving Allpages special page for wikipedia:ksh from 00f, namespace 0
... ad infinitum ...
This may be related to bug #1852173 in this tracker.
Note that a page "00f" does not exist. I was expecting to start at the next available page, which does in fact exist.
----------------------------------------------------------------------