Bugs item #2929809, was opened at 2010-01-11 14:24
Message generated for change (Tracker Item Submitted) made by basilicofresco
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2929809...

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Davide Bolsi (basilicofresco)
Assigned to: Nobody/Anonymous (nobody)
Summary: 'NoneType' object has no attribute 'strip'
Initial Comment:
I was getting a strange error with replace.py on some large dump files, so I did some testing. I found that replace.py halts while loading certain pages, e.g. "Technical Architecture Group" on en.wikipedia, and I got the same error with a page on it.wikipedia. It halts on the very same page both when reading from the dump file and when loading the page directly. This error is particularly annoying because, for example, it prevents me from scanning a whole dump.
Examples:
1) direct page loading
C:\pywikipedia>replace.py -lang:en -page:"Technical Architecture Group" "a" "b"
Getting 1 pages from wikipedia:en...
Traceback (most recent call last):
  File "C:\pywikipedia\pagegenerators.py", line 860, in __iter__
    for loaded_page in self.preload(somePages):
  File "C:\pywikipedia\pagegenerators.py", line 879, in preload
    wikipedia.getall(site, pagesThisSite)
  File "C:\pywikipedia\wikipedia.py", line 4159, in getall
    _GetAll(site, pages, throttle, force).run()
  File "C:\pywikipedia\wikipedia.py", line 3842, in run
    xml.sax.parseString(data, handler)
  File "C:\Python26\lib\xml\sax\__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "C:\Python26\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Python26\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Python26\lib\xml\sax\expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "C:\Python26\lib\xml\sax\expatreader.py", line 304, in end_element
    self._cont_handler.endElement(name)
  File "C:\pywikipedia\xmlreader.py", line 182, in endElement
    text, self.username,
AttributeError: MediaWikiXmlHandler instance has no attribute 'username'
MediaWikiXmlHandler instance has no attribute 'username'
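The traceback above is consistent with a SAX handler that only sets `username` when it sees a `<username>` element and then reads it unconditionally at the end of the revision. The following is a minimal sketch of that pattern and one possible guard. It is not pywikipedia's code: `RevisionHandler`, `usernames` and `SAMPLE` are hypothetical names, and the assumption that the affected page has a revision without a `<username>` element has not been verified.

```python
import xml.sax


class RevisionHandler(xml.sax.ContentHandler):
    """Hypothetical stand-in, NOT pywikipedia's MediaWikiXmlHandler."""

    def __init__(self):
        super().__init__()
        self.revisions = []
        self._buffer = []

    def startElement(self, name, attrs):
        self._buffer = []
        if name == "revision":
            # Reset per-revision state up front, so endElement can always
            # read self.username even if no <username> element follows.
            self.username = None

    def characters(self, content):
        self._buffer.append(content)

    def endElement(self, name):
        if name == "username":
            self.username = "".join(self._buffer).strip()
        elif name == "revision":
            # Fall back to an empty string when the name is missing.
            self.revisions.append(self.username or "")


def usernames(xml_bytes):
    """Return one username per <revision>, '' where none was given."""
    handler = RevisionHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.revisions


# Assumed input shape: the second revision has no <username> element.
SAMPLE = (
    b"<page>"
    b"<revision><contributor><username>Alice</username></contributor></revision>"
    b'<revision><contributor deleted="deleted" /></revision>'
    b"</page>"
)
```

Without the reset in `startElement`, a first revision lacking `<username>` would raise the same kind of `AttributeError` on `self.username`.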
2) dump file (on this dump "Successions of Philosophers" immediately precedes "Technical Architecture Group")
C:\pywikipedia>replace.py -xml:enwiki-20091128-pages-articles.xml -lang:en -xmlstart:"Successions of Philosophers" "a" "b"
Reading XML dump...
Getting 1 pages from wikipedia:en...
Successions of Philosophers <<<
[...cut......cut......cut...]
Do you want to accept these changes? ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll, [q]uit) n
Traceback (most recent call last):
  File "C:\pywikipedia\pagegenerators.py", line 847, in __iter__
    for page in self.wrapped_gen:
  File "C:\pywikipedia\pagegenerators.py", line 779, in DuplicateFilterPageGenerator
    for page in generator:
  File "C:\pywikipedia\replace.py", line 218, in __iter__
    for entry in self.parser:
  File "C:\pywikipedia\xmlreader.py", line 295, in new_parse
    for rev in self._parse(event, elem):
  File "C:\pywikipedia\xmlreader.py", line 304, in _parse_only_latest
    yield self._create_revision(revision)
  File "C:\pywikipedia\xmlreader.py", line 341, in _create_revision
    redirect=self.isredirect
  File "C:\pywikipedia\xmlreader.py", line 64, in __init__
    self.username = username.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
'NoneType' object has no attribute 'strip'
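The last frame shows `username.strip()` being called while `username` is `None`. The sketch below reproduces that failure in isolation and shows one possible guard. `DumpEntry` and `strip_unguarded` are hypothetical names used only for illustration, not the actual class at xmlreader.py line 64, and treating a missing username as an empty string is an assumption about the desired behaviour.

```python
def strip_unguarded(username):
    """Mirrors the failing call: breaks when username is None."""
    return username.strip()


class DumpEntry:
    """Hypothetical stand-in for the entry object built from a dump revision."""

    def __init__(self, title, username):
        self.title = title
        # Guard: a revision without a contributor name yields None here,
        # so fall back to '' before stripping whitespace.
        self.username = (username or "").strip()


# Reproduce the error from the traceback without touching a real dump.
try:
    strip_unguarded(None)
    error_message = ""
except AttributeError as exc:
    error_message = str(exc)

# The guarded constructor accepts both a normal and a missing username.
with_name = DumpEntry("Successions of Philosophers", "  Alice ")
without_name = DumpEntry("Technical Architecture Group", None)
```

With a guard of this kind, a full dump scan would continue past revisions that carry no username instead of halting.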
Thanks in advance!