Hi,
Unfortunately, en.planet updates are still stuck.
https://bugzilla.wikimedia.org/show_bug.cgi?id=45806
The error is raised in feedparser.py:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)
I would really appreciate help on this from somebody more familiar with Python, Django templates, feed parsing, and Unicode problems.
Thanks
----
INFO:planet.runner:Loading cached data
Traceback (most recent call last):
  File "/usr/bin/planet", line 138, in <module>
    splice.apply(doc.toxml('utf-8'))
  File "/usr/lib/pymodules/python2.7/planet/splice.py", line 118, in apply
    output_file = shell.run(template_file, doc)
  File "/usr/lib/pymodules/python2.7/planet/shell/__init__.py", line 66, in run
    module.run(template_resolved, doc, output_file, options)
  File "/usr/lib/pymodules/python2.7/planet/shell/tmpl.py", line 254, in run
    for key,value in template_info(doc).items():
  File "/usr/lib/pymodules/python2.7/planet/shell/tmpl.py", line 193, in template_info
    data=feedparser.parse(source)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 3525, in parse
    feedparser.feed(data)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 1662, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.7/sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "/usr/lib/python2.7/sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 569, in unknown_endtag
    method()
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 1512, in _end_content
    value = self.popContent('content')
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 849, in popContent
    value = self.pop(tag)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 764, in pop
    mfresults = _parseMicroformats(output, self.baseuri, self.encoding)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 2219, in _parseMicroformats
    p.vcard = p.findVCards(p.document)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 2161, in findVCards
    sVCards += '\n'.join(arLines) + '\n'
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)
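For anyone less familiar with this class of error, here is a minimal sketch of what the message means. It is not the feedparser code. In Python 2, joining a byte string with a unicode string makes the interpreter decode the bytes with the ASCII codec behind the scenes, which is presumably what happens on the `sVCards +=` line. The sample bytes below are an assumption: 0xe2 is the first byte of many three-byte UTF-8 sequences (typographic quotes and dashes, for example), but the actual character in the offending feed is not known from the traceback.

```python
# Sketch: reproduce the error message by decoding UTF-8 bytes with the
# ASCII codec explicitly (Python 2 does this implicitly when a byte
# string and a unicode string are concatenated).

# Hypothetical sample: 32 ASCII bytes followed by a UTF-8 encoded
# right single quotation mark (U+2019), whose first byte is 0xe2.
sample = b"x" * 32 + u"\u2019".encode("utf-8")


def try_decode(data, codec):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(codec), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


ascii_text, ascii_error = try_decode(sample, "ascii")
utf8_text, utf8_error = try_decode(sample, "utf-8")

print(ascii_error)      # same wording as the error in the traceback
print(repr(utf8_text))  # decoding as UTF-8 succeeds
```

If this reading is right, the usual remedy is to decode the bytes with the feed's real encoding before they are mixed with unicode strings, rather than letting the ASCII default apply.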
----
note: this did not happen from the beginning and does not apply to other languages (or at least not all of them), so it depends which feeds you subscribe to and their (current) content. It is not that easy though to identify which feed causes it, since the update goes through all of them, writes to cache directory, and in the very end tries to assemble HTML from cached data, not telling you where exactly it got stuck.
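One way to narrow down the offending feed might be to run the failing step over each cached file separately and record which ones raise. The following is only a sketch under assumptions: `find_failing_files` and `ascii_only` are hypothetical helpers, the temporary directory stands in for the real cache directory, and in practice `parse` would have to be a wrapper around whatever call actually fails (the traceback points at `feedparser.parse(source)`).

```python
import os
import tempfile


def find_failing_files(directory, parse):
    """Call parse(data) on each file in directory; return [(name, error)] for failures."""
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as handle:
            data = handle.read()
        try:
            parse(data)
        except Exception as exc:  # keep going so every bad file is reported
            failures.append((name, "%s: %s" % (type(exc).__name__, exc)))
    return failures


def ascii_only(data):
    # Stand-in parser (assumption): fails on any non-ASCII byte.
    data.decode("ascii")


# Demo with two made-up cache files; only the second contains byte 0xe2.
with tempfile.TemporaryDirectory() as cache_dir:
    with open(os.path.join(cache_dir, "good.xml"), "wb") as f:
        f.write(b"<entry>plain</entry>")
    with open(os.path.join(cache_dir, "bad.xml"), "wb") as f:
        f.write(u"<entry>it\u2019s</entry>".encode("utf-8"))
    failures = find_failing_files(cache_dir, ascii_only)

for name, error in failures:
    print(name, error)
```

Catching the exception per file is the point of the sketch: the run no longer stops at the first bad entry without saying which one it was.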
It's a matter of encoding; see my post on the bug.
On Thu, Mar 14, 2013 at 6:29 PM, Daniel Zahn dzahn@wikimedia.org wrote:
note: this did not happen from the beginning and does not apply to other languages (or at least not all of them), so it depends which feeds you subscribe to and their (current) content. It is not that easy though to identify which feed causes it, since the update goes through all of them, writes to cache directory, and in the very end tries to assemble HTML from cached data, not telling you where exactly it got stuck.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Thank you. In the meantime, I have fixed planet updates via a live hack. See the ticket for details.