Bugs item #2035835, was opened at 2008-08-02 10:02
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2035835...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: SaxParseBug caused error invalid literal for int()
Initial Comment:
I got an error message and a trace dump from interwiki.py, which afterwards continues gracefully. Here are the messages:
python /home/purodha/pywikipedia/interwiki.py -v -initialredirect -new:3
Checked for running processes. 1 processes currently running, including the current process.
Pywikipediabot (r5776 (wikipedia.py), Aug 01 2008, 15:39:04)
Python 2.5.2 (r252:60911, May 28 2008, 19:19:25)
[GCC 4.2.4 (Debian 4.2.4-1)]
Retrieving mediawiki messages from Special:Allmessages
WARNING: No character set found.
NOTE: Number of pages queued is 0, trying to add 60 more.
Getting 3 pages from wikipedia:ksh...
-- some lines skipped --
Getting 1 pages from wikipedia:am...
ERROR: SaxParseBug caused error invalid literal for int() with base 10: 'NS_CATEGORY'. Dump SaxParseBug_wikipedia_am__Sat_Aug__2_09-54-57_2008.dump created.
Traceback (most recent call last):
  File "/home/purodha/pywikipedia/pagegenerators.py", line 768, in __iter__
    for loaded_page in self.preload(somePages):
  File "/home/purodha/pywikipedia/pagegenerators.py", line 785, in preload
    wikipedia.getall(site, pagesThisSite)
  File "/home/purodha/pywikipedia/wikipedia.py", line 2950, in getall
    _GetAll(site, pages, throttle, force).run()
  File "/home/purodha/pywikipedia/wikipedia.py", line 2798, in run
    xml.sax.parseString(data, handler)
  File "/usr/lib/python2.5/site-packages/_xmlplus/sax/__init__.py", line 47, in parseString
    parser.parse(inpsrc)
  File "/usr/lib/python2.5/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.5/site-packages/_xmlplus/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.5/site-packages/_xmlplus/sax/expatreader.py", line 216, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.5/site-packages/_xmlplus/sax/expatreader.py", line 312, in start_element
    self._cont_handler.startElement(name, AttributesImpl(attrs))
  File "/home/purodha/pywikipedia/xmlreader.py", line 150, in startElement
    self.namespaceid = int(attrs['key'])
ValueError: invalid literal for int() with base 10: 'NS_CATEGORY'
invalid literal for int() with base 10: 'NS_CATEGORY'
Getting page [[am:????]]
etc.
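
For illustration only: the last frame of the traceback shows int(attrs['key']) being handed the string 'NS_CATEGORY' instead of a number. The sketch below reproduces that conversion in isolation and shows one possible way to tolerate a symbolic key. The helper name parse_namespace_key and the SYMBOLIC_NAMESPACES table are hypothetical and not part of pywikipedia; the only value assumed is that the Category namespace has id 14, as in a standard MediaWiki installation. This is not a proposed patch to xmlreader.py.

```python
# Hypothetical lookup table, not taken from pywikipedia. It assumes the
# symbolic name seen in the dump refers to the standard Category namespace.
SYMBOLIC_NAMESPACES = {"NS_CATEGORY": 14}


def parse_namespace_key(key, symbolic=SYMBOLIC_NAMESPACES):
    """Return an integer namespace id, or None if the key is not understood.

    Numeric strings such as "14" or "-1" are converted directly. A symbolic
    name such as "NS_CATEGORY" makes int() raise ValueError, which is the
    failure reported above; here it is looked up in the table instead.
    """
    try:
        return int(key)
    except ValueError:
        return symbolic.get(key)
```

Whether the right fix is a fallback like this in the bot or a correction of the export on the am wiki cannot be decided from the log alone.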