Bugs item #2929809, was opened at 2010-01-11 14:24
Message generated for change (Tracker Item Submitted) made by basilicofresco
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=292980…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Davide Bolsi (basilicofresco)
Assigned to: Nobody/Anonymous (nobody)
Summary: 'NoneType' object has no attribute 'strip'
Initial Comment:
I was getting a strange error with replace.py on some large dump files, so I did some
testing.
I found that replace.py halts while loading certain pages, e.g. "Technical
Architecture Group" on en.wikipedia, and I hit the same kind of failure with a page on
it.wikipedia.
It halts on the very same page both when reading the dump file and when loading the page
directly. This error is particularly annoying because, for example, it prevents me from
scanning the whole dump.
Examples:
1) direct page loading
C:\pywikipedia>replace.py -lang:en -page:"Technical Architecture Group" "a" "b"
Getting 1 pages from wikipedia:en...
Traceback (most recent call last):
File "C:\pywikipedia\pagegenerators.py", line 860, in __iter__
for loaded_page in self.preload(somePages):
File "C:\pywikipedia\pagegenerators.py", line 879, in preload
wikipedia.getall(site, pagesThisSite)
File "C:\pywikipedia\wikipedia.py", line 4159, in getall
_GetAll(site, pages, throttle, force).run()
File "C:\pywikipedia\wikipedia.py", line 3842, in run
xml.sax.parseString(data, handler)
File "C:\Python26\lib\xml\sax\__init__.py", line 49, in parseString
parser.parse(inpsrc)
File "C:\Python26\lib\xml\sax\expatreader.py", line 107, in parse
xmlreader.IncrementalParser.parse(self, source)
File "C:\Python26\lib\xml\sax\xmlreader.py", line 123, in parse
self.feed(buffer)
File "C:\Python26\lib\xml\sax\expatreader.py", line 207, in feed
self._parser.Parse(data, isFinal)
File "C:\Python26\lib\xml\sax\expatreader.py", line 304, in end_element
self._cont_handler.endElement(name)
File "C:\pywikipedia\xmlreader.py", line 182, in endElement
text, self.username,
AttributeError: MediaWikiXmlHandler instance has no attribute 'username'
MediaWikiXmlHandler instance has no attribute 'username'
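The first traceback suggests that endElement reads self.username before any <username> element has set it. Below is a minimal sketch of one way to guard against that, assuming the revision simply carries no <username> (for instance because the contributor was hidden). RevisionHandler, parse_revisions and SAMPLE are hypothetical names for illustration, not the actual pywikipedia code.

```python
import xml.sax


class RevisionHandler(xml.sax.ContentHandler):
    """Hypothetical handler; NOT the real MediaWikiXmlHandler."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.revisions = []
        self._field = None  # name of the element whose text is being collected

    def startElement(self, name, attrs):
        if name == "revision":
            # Reset per-revision state so that a missing <username>
            # leaves a default instead of an undefined attribute.
            self.username = None
            self.text = u""
        elif name in ("username", "text"):
            self._field = name
            setattr(self, name, u"")

    def characters(self, content):
        if self._field is not None:
            setattr(self, self._field, getattr(self, self._field) + content)

    def endElement(self, name):
        if name == self._field:
            self._field = None
        elif name == "revision":
            self.revisions.append((self.username, self.text))


def parse_revisions(data):
    """Parse an XML byte string and return (username, text) pairs."""
    handler = RevisionHandler()
    xml.sax.parseString(data, handler)
    return handler.revisions


# Assumed shape of a revision without a visible contributor name.
SAMPLE = (b'<page><revision><contributor deleted="deleted" />'
          b'<text>hello</text></revision></page>')
```

With this sketch, parse_revisions(SAMPLE) yields a revision whose username is None rather than raising AttributeError; whatever consumes that value still has to cope with None.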
2) dump file (in this dump "Successions of Philosophers" immediately precedes
"Technical Architecture Group")
C:\pywikipedia>replace.py -xml:enwiki-20091128-pages-articles.xml -lang:en -xmlstart:"Successions of Philosophers" "a" "b"
Reading XML dump...
Getting 1 pages from wikipedia:en...
>>> Successions of Philosophers <<<
[...cut......cut......cut...]
Do you want to accept these changes? ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll, [q]uit) n
Traceback (most recent call last):
File "C:\pywikipedia\pagegenerators.py", line 847, in __iter__
for page in self.wrapped_gen:
File "C:\pywikipedia\pagegenerators.py", line 779, in DuplicateFilterPageGenerator
for page in generator:
File "C:\pywikipedia\replace.py", line 218, in __iter__
for entry in self.parser:
File "C:\pywikipedia\xmlreader.py", line 295, in new_parse
for rev in self._parse(event, elem):
File "C:\pywikipedia\xmlreader.py", line 304, in _parse_only_latest
yield self._create_revision(revision)
File "C:\pywikipedia\xmlreader.py", line 341, in _create_revision
redirect=self.isredirect
File "C:\pywikipedia\xmlreader.py", line 64, in __init__
self.username = username.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
'NoneType' object has no attribute 'strip'
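The second traceback shows username.strip() being called while username is None. A minimal sketch of a None-safe guard follows, assuming a missing username should be treated as an empty string. clean_username and Entry are hypothetical names, not the real classes in xmlreader.py.

```python
def clean_username(username):
    """Return the stripped username, or an empty string if it is missing."""
    if username is None:
        # Assumption: a revision without a contributor name maps to "".
        return u""
    return username.strip()


class Entry(object):
    """Hypothetical stand-in for the object built at xmlreader.py line 64."""

    def __init__(self, title, username):
        self.title = title
        self.username = clean_username(username)
```

Whether an empty string or None is the better placeholder depends on how callers use the field; the sketch only shows that the guard keeps the dump scan from aborting on such a page.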
Thanks in advance!