Revision: 5834
Author: russblau
Date: 2008-08-22 13:51:08 +0000 (Fri, 22 Aug 2008)
Log Message:
-----------
Allow overriding sys.argv in handleArgs()
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-08-22 13:42:58 UTC (rev 5833)
+++ trunk/pywikipedia/wikipedia.py 2008-08-22 13:51:08 UTC (rev 5834)
@@ -5964,17 +5964,21 @@
# I don't know how non-Western Windows versions behave.
return unicode(arg, config.console_encoding)
-def handleArgs():
+def handleArgs(*args):
"""Handle standard command line arguments, return the rest as a list.
Takes the commandline arguments, converts them to Unicode, processes all
global parameters such as -lang or -log. Returns a list of all arguments
that are not global. This makes sure that global arguments are applied
first, regardless of the order in which the arguments were given.
+
+ args may be passed as an argument, thereby overriding sys.argv
+
"""
global default_code, default_family, verbose
# get commandline arguments
- args = sys.argv
+ if not args:
+ args = sys.argv
# get the name of the module calling this function. This is
# required because the -help option loads the module's docstring and because
# the module name will be used for the filename of the log.
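The pattern this patch introduces (fall back to sys.argv only when no explicit arguments are given) can be sketched standalone. The helper below mirrors the patched function's shape but is otherwise hypothetical and simplified; the real handleArgs() applies global options such as -lang and -log rather than merely filtering them out:

```python
import sys

def handle_args(*args):
    """Return the non-global arguments, mirroring the r5834 change.

    If no arguments are passed explicitly, fall back to the command
    line (sys.argv[1:]); passing arguments overrides sys.argv, which
    lets other scripts reuse the parser programmatically.
    """
    if not args:
        args = tuple(sys.argv[1:])
    non_global = []
    for arg in args:
        # Treat a couple of known switches as "global"; a real
        # implementation would apply the setting instead of skipping it.
        if arg.startswith('-lang:') or arg.startswith('-log'):
            continue
        non_global.append(arg)
    return non_global

# Overriding sys.argv with explicit arguments:
print(handle_args('-lang:de', 'SomePage', 'OtherPage'))  # -> ['SomePage', 'OtherPage']
```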
Bugs item #2065095, was opened at 2008-08-21 15:45
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2065095&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 7
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki - Bogus "Bot blocked" ?
Initial Comment:
"Bot blocked" message but no block found.
python /home/purodha/pywikipedia/interwiki.py -v -putthrottle:1 -initialredirect -new:2
Checked for running processes. 1 processes currently running, including the current process.
Pywikipediabot (r5821 (wikipedia.py), Aug 20 2008, 15:32:53)
Python 2.5.2 (r252:60911, Aug 14 2008, 13:31:58)
[GCC 4.3.1]
Retrieving mediawiki messages from Special:Allmessages
WARNING: No character set found.
NOTE: Number of pages queued is 0, trying to add 60 more.
Getting 2 pages from wikipedia:ksh...
[[Betty Hutton]]: [[ksh:Betty Hutton]] gives new interwiki [[cy:Betty Hutton]]
--- few lines skipped ---
Updating links on page [[da:Betty Hutton]].
Changes to be made: Tilfjer: [[ksh:Betty Hutton]]
+ [[ksh:Betty Hutton]]
NOTE: Updating live wiki...
Getting information for site wikipedia:da
WARNING: Your account on wikipedia:da is blocked. Editing using this account will stop the run.
Getting information for site wikipedia:da
Getting information for site wikipedia:da
Dump ksh (wikipedia) saved
Traceback (most recent call last):
File "/home/purodha/pywikipedia/interwiki.py", line 1760, in <module>
bot.run()
File "/home/purodha/pywikipedia/interwiki.py", line 1497, in run
self.queryStep()
File "/home/purodha/pywikipedia/interwiki.py", line 1476, in queryStep
subj.finish(self)
File "/home/purodha/pywikipedia/interwiki.py", line 1057, in finish
if self.replaceLinks(page, new, bot):
File "/home/purodha/pywikipedia/interwiki.py", line 1215, in replaceLinks
status, reason, data = page.put(newtext, comment = wikipedia.translate(page.site().lang, msg)[0] + mods)
File "/home/purodha/pywikipedia/wikipedia.py", line 1264, in put
self.site().checkBlocks(sysop = sysop)
File "/home/purodha/pywikipedia/wikipedia.py", line 4191, in checkBlocks
raise UserBlocked('User is blocked in site %s' % self)
wikipedia.UserBlocked: User is blocked in site wikipedia:da
I've inspected all the logs on dawiki; there is no block of user:Purbo_T or toolserver.org (91.198.174.203) to be found.
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2008-08-22 10:50
Message:
Logged In: YES
user_id=46450
Originator: YES
For whatever reason, the bot has been running again for 25 minutes. No
software was updated, though, so the cause was likely on the Danish Wikipedia,
yet it could not be determined by log file inspection.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2065095&group_…
Revision: 5832
Author: nicdumz
Date: 2008-08-22 05:52:45 +0000 (Fri, 22 Aug 2008)
Log Message:
-----------
Print the good url in case of error in getUrl when no_hostname=True
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-08-21 20:52:15 UTC (rev 5831)
+++ trunk/pywikipedia/wikipedia.py 2008-08-22 05:52:45 UTC (rev 5832)
@@ -4468,9 +4468,9 @@
# We assume that the server is down. Wait some time, then try again.
output(u"%s" % e)
output(u"""\
-WARNING: Could not open '%s://%s%s'. Maybe the server or
+WARNING: Could not open '%s'. Maybe the server or
your connection is down. Retrying in %i minutes..."""
- % (self.protocol(), self.hostname(), path,
+ % (url,
retry_idle_time))
time.sleep(retry_idle_time * 60)
# Next time wait longer, but not longer than half an hour
Revision: 5830
Author: multichill
Date: 2008-08-21 19:49:47 +0000 (Thu, 21 Aug 2008)
Log Message:
-----------
category_blacklist at http://commons.wikimedia.org/wiki/User:Multichill/Category_blacklist
Modified Paths:
--------------
trunk/pywikipedia/imagerecat.py
Modified: trunk/pywikipedia/imagerecat.py
===================================================================
--- trunk/pywikipedia/imagerecat.py 2008-08-21 16:06:39 UTC (rev 5829)
+++ trunk/pywikipedia/imagerecat.py 2008-08-21 19:49:47 UTC (rev 5830)
@@ -20,20 +20,24 @@
import pagegenerators, StringIO
import socket
-category_blacklist = [u'Hidden categories',
- u'Stub pictures']
-
+category_blacklist = []
countries = []
-def getCountries():
+def initLists():
'''
- Get the list of countries from Commons.
+ Get the list of countries & the blacklist from Commons.
'''
- result = []
+ global category_blacklist
+ global countries
+
+ blacklistPage = wikipedia.Page(wikipedia.getSite(), u'User:Multichill/Category_blacklist')
+ for cat in blacklistPage.linkedPages():
+ category_blacklist.append(cat.titleWithoutNamespace())
+
countryPage = wikipedia.Page(wikipedia.getSite(), u'User:Multichill/Countries')
for country in countryPage.linkedPages():
- result.append(country.titleWithoutNamespace())
- return result
+ countries.append(country.titleWithoutNamespace())
+ return
def categorizeImages(generator, onlyfilter):
'''
@@ -241,8 +245,8 @@
generator = genFactory.handleArg(arg)
if not generator:
generator = pagegenerators.CategorizedPageGenerator(catlib.Category(site, u'Category:Media needing categories'), recurse=True)
- global countries
- countries = getCountries()
+
+ initLists()
categorizeImages(generator, onlyfilter)
wikipedia.output(u'All done')
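The refactor above replaces a hard-coded blacklist with lists populated from on-wiki pages at startup. The shape of that initialization can be sketched without pywikipedia; fetch_linked_titles below is a hypothetical stand-in for Page(...).linkedPages():

```python
category_blacklist = []
countries = []

def init_lists(fetch_linked_titles):
    """Fill the module-level lists from on-wiki pages, as initLists() does.

    fetch_linked_titles maps a page title to the titles linked from it;
    in the real bot this is Page(site, title).linkedPages().
    """
    global category_blacklist, countries
    category_blacklist = list(fetch_linked_titles(u'User:Multichill/Category_blacklist'))
    countries = list(fetch_linked_titles(u'User:Multichill/Countries'))

# A fake fetcher standing in for live wiki access:
pages = {
    u'User:Multichill/Category_blacklist': [u'Hidden categories', u'Stub pictures'],
    u'User:Multichill/Countries': [u'Denmark', u'Germany'],
}
init_lists(pages.get)
print(category_blacklist)  # -> [u'Hidden categories', u'Stub pictures']
```

Keeping the blacklist on a wiki page means it can be edited without a code change, at the cost of one extra page fetch per run.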
Bugs item #2064976, was opened at 2008-08-21 10:44
Message generated for change (Comment added) made by stigmj
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2064976&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Multichill (multichill)
Assigned to: Nobody/Anonymous (nobody)
Summary: All pages soup problems
Initial Comment:
While running python2.4 imageuncat.py -start:Image:Chironomidae
Working on Image:Cicada.ogg
Got category Category:Images transwikied by BetacommandBot
Working on Image:Cicada.png
Got category Category:Magicicada
Working on Image:Cicada0001.jpg
Got category Category:Cicadellidae
Traceback (most recent call last):
File "/home/bot/pywikipedia/pagegenerators.py", line 755, in __iter__
for page in self.wrapped_gen:
File "/home/bot/pywikipedia/pagegenerators.py", line 688, in DuplicateFilterPageGenerator
for page in generator:
File "/home/bot/pywikipedia/pagegenerators.py", line 239, in AllpagesPageGenerator
for page in site.allpages(start = start, namespace = namespace, includeredirects = includeredirects):
File "/home/bot/pywikipedia/wikipedia.py", line 5169, in allpages
for p in soup.api.query.allpages:
AttributeError: 'NoneType' object has no attribute 'query'
'NoneType' object has no attribute 'query'
Pywikipedia [http] trunk/pywikipedia (r5827, Aug 21 2008, 14:32:44)
Python 2.4.4 (#1, Jun 11 2007, 23:35:50)
[GCC 3.3.3 (NetBSD nb3 20040520)]
Why are we using BeautifulSoup anyway? We don't need to screen-scrape the API.
----------------------------------------------------------------------
Comment By: Stig Meireles Johansen (stigmj)
Date: 2008-08-21 15:00
Message:
Logged In: YES
user_id=2116333
Originator: NO
I did a quick hack myself before I saw this beautifulsoup-version. I did
it with json and simplejson... I don't know which method is better, but
this beautifulsoup-version is prettier. :)
----------------------------------------------------------------------
Comment By: Jitse Niesen (jitseniesen)
Date: 2008-08-21 12:19
Message:
Logged In: YES
user_id=194734
Originator: NO
I found something strange in allpages() which might have caused the
problem and fixed it a minute ago in r5829. However, I'm not sure that this
did cause the problem, so I'm leaving the bug open.
BeautifulSoup is used to parse the XML that the API provides. Do you think
it's the wrong tool (I honestly don't know)?
----------------------------------------------------------------------
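For comparison, the allpages XML the API returns can be parsed with the standard library alone, with an explicit guard against the missing element that produced the AttributeError in the traceback. The sample response below is illustrative, not captured from a live server:

```python
import xml.etree.ElementTree as ET

sample = """<?xml version="1.0"?>
<api>
  <query>
    <allpages>
      <p pageid="1" ns="0" title="Betty Hutton" />
      <p pageid="2" ns="0" title="Cicada" />
    </allpages>
  </query>
</api>"""

root = ET.fromstring(sample)
# Check for a missing <query> element instead of letting an
# AttributeError propagate, as happened with the soup-based parser:
query = root.find('query')
if query is not None:
    titles = [p.get('title') for p in query.findall('allpages/p')]
else:
    titles = []
print(titles)  # -> ['Betty Hutton', 'Cicada']
```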
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2064976&group_…