[Pywikipedia-l] [ pywikipediabot-Bugs-2064976 ] All pages soup problems

SourceForge.net noreply at sourceforge.net
Fri Jan 30 19:44:47 UTC 2009


Bugs item #2064976, was opened at 2008-08-21 10:44
Message generated for change (Comment added) made by russblau
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2064976&group_id=93107

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
>Status: Closed
>Resolution: Works For Me
Priority: 7
Private: No
Submitted By: Multichill (multichill)
Assigned to: Nobody/Anonymous (nobody)
Summary: All pages soup problems

Initial Comment:
While running python2.4 imageuncat.py -start:Image:Chironomidae

Working on Image:Cicada.ogg
Got category Category:Images transwikied by BetacommandBot
Working on Image:Cicada.png
Got category Category:Magicicada
Working on Image:Cicada0001.jpg
Got category Category:Cicadellidae
Traceback (most recent call last):
  File "/home/bot/pywikipedia/pagegenerators.py", line 755, in __iter__
    for page in self.wrapped_gen:
  File "/home/bot/pywikipedia/pagegenerators.py", line 688, in DuplicateFilterPageGenerator
    for page in generator:
  File "/home/bot/pywikipedia/pagegenerators.py", line 239, in AllpagesPageGenerator
    for page in site.allpages(start = start, namespace = namespace, includeredirects = includeredirects):
  File "/home/bot/pywikipedia/wikipedia.py", line 5169, in allpages
    for p in soup.api.query.allpages:
AttributeError: 'NoneType' object has no attribute 'query'
'NoneType' object has no attribute 'query'

Pywikipedia [http] trunk/pywikipedia (r5827, Aug 21 2008, 14:32:44)
Python 2.4.4 (#1, Jun 11 2007, 23:35:50) 
[GCC 3.3.3 (NetBSD nb3 20040520)]
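The AttributeError above is the classic failure mode when a parser returns None for a missing element and the caller chains attribute access without a guard. A minimal stdlib illustration (using xml.etree rather than BeautifulSoup, and a hypothetical error payload, purely for demonstration):

```python
import xml.etree.ElementTree as ET

# Hypothetical API error response with no <query> element, such as a
# server might return under load (sample payload is an assumption).
error_xml = "<api><error code='maxlag' info='Waiting for a database server'/></api>"

root = ET.fromstring(error_xml)
query = root.find("query")  # returns None when the element is absent

# Unguarded chaining, e.g. query.findall("allpages"), would raise
# AttributeError: 'NoneType' object has no attribute ... as in the traceback.
if query is None:
    pages = []
else:
    pages = query.findall(".//p")

print(pages)
```

The fix pattern is the same regardless of parser: check the intermediate node for None before descending into it.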

Why are we using BeautifulSoup anyway? We don't need to screen-scrape the API.
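For comparison, requesting format=json from the MediaWiki API avoids XML parsing entirely; the json module decodes the response into plain dicts and lists. A sketch with a canned sample payload (the payload shape is an assumption based on the standard list=allpages response; real data would come from api.php?action=query&list=allpages&format=json):

```python
import json

# Hypothetical sample of an allpages API response in JSON form.
sample = '''{"query": {"allpages": [
    {"pageid": 1, "ns": 6, "title": "Image:Cicada.ogg"},
    {"pageid": 2, "ns": 6, "title": "Image:Cicada.png"}]}}'''

data = json.loads(sample)
# .get() with defaults guards against a missing "query" key the same way
# the None check guards the XML version.
titles = [p["title"] for p in data.get("query", {}).get("allpages", [])]

print(titles)
```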

----------------------------------------------------------------------

>Comment By: Russell Blau (russblau)
Date: 2009-01-30 14:44

Message:
>pagegenerators.py -start:Image:Chironomidae
Checked for running processes. 1 processes currently running, including
the current process.
File:Chiropotes aequatorialis map.png
File:Chiropotes chiropotes map.png
File:Chiropotes irrorata map.png
(etc.)


----------------------------------------------------------------------

Comment By: Stig Meireles Johansen (stigmj)
Date: 2008-08-21 15:00

Message:
Logged In: YES 
user_id=2116333
Originator: NO

I did a quick hack myself before I saw this BeautifulSoup version; I did
it with the json format and simplejson. I don't know which method is better,
but this BeautifulSoup version is prettier.. :)

----------------------------------------------------------------------

Comment By: Jitse Niesen (jitseniesen)
Date: 2008-08-21 12:19

Message:
Logged In: YES 
user_id=194734
Originator: NO

I found something strange in allpages() which might have caused the
problem and fixed it a minute ago in r5829. However, I'm not sure that this
did cause the problem, so I'm leaving the bug open.

BeautifulSoup is used to parse the XML that the API provides. Do you think
it's the wrong tool (I honestly don't know)?

----------------------------------------------------------------------

