Bugs item #1852173, was opened at 2007-12-17 08:57
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1852173...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: other
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: utf-8 coding problem, kills weblinkchecker
Initial Comment:
Weblinkchecker chokes in many instances when reading Special:Allpages of the ksh Wikipedia. It claims to see non-unicode data, which is unlikely to be the case. I have not dug into the code in detail yet, and I cannot identify the offending byte sequences at the moment.
Here is a copy of the Linux command line and the output it generated:
$~> python weblinkchecker.py -putthrottle:300 -start:00er -v -family:wikipedia -lang:ksh
Checked for running processes. 1 processes currently running, including the current process.
Pywikipediabot (r4720 (wikipedia.py), Dec 15 2007, 18:57:27)
Python 2.4.4 (#2, Aug 16 2007, 00:34:54)
[GCC 4.1.3 20070812 (prerelease) (Debian 4.1.2-15)]
Retrieving Allpages special page for wikipedia:ksh from 00er, namespace 0
Retrieving Allpages special page for wikipedia:ksh from 00er%20Joare%20%28Watt%20%C4%97%C3%9F%C3%9F%20datt%3F%29%21, namespace 0
Retrieving Allpages special page for wikipedia:ksh from 00er%2520Joare%2520%2528Watt%2520%25C4%2597%25C3%259F%25C3%259F%2520datt%253F%2529%2521, namespace 0
DBG> BUG: Non-unicode passed to wikipedia.output without decoder!
  File "threading.py", line 442, in __bootstrap
    self.run()
  File "/home/purodha/pywikipedia/pagegenerators.py", line 632, in run
    wikipedia.output(str(e))
  File "/home/purodha/pywikipedia/wikipedia.py", line 5351, in output
    print traceback.print_stack()
None
DBG> Attempting to recover, but please report this problem
Couldn't extract allpages special page. Make sure you're using MonoBook skin.
Saving history...
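
One detail in the output above may be worth noting: in the third "Retrieving" line every "%" of the second line appears as "%25", which is what one would see if an already percent-encoded title were encoded a second time. Whether this is related to the non-unicode error is not established here. The following is only a minimal sketch of that effect, not the bot's actual code. It assumes Python 3's urllib.parse.quote (under the Python 2.4 shown in the log the function would be urllib.quote), and the title string is an assumption, reconstructed by decoding the escapes in the log.

```python
from urllib.parse import quote, unquote

# Assumed page title, reconstructed by decoding the percent-escapes
# shown in the second "Retrieving" line of the log.
title = "00er Joare (Watt \u0117\u00df\u00df datt?)!"

# Encoding once yields the form seen in the second log line.
once = quote(title, safe="")

# Encoding the already-encoded string turns every "%" into "%25",
# which yields the form seen in the third log line.
twice = quote(once, safe="")

print(once)
print(twice)

# Two rounds of decoding are needed to get the title back.
print(unquote(unquote(twice)) == title)
```

If this is what happens, the server would be asked for a page whose title literally contains "%20", "%28" and so on, rather than for the intended title.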
I had made sure that the user [[:ksh:User:Weblinkchecker]] was logged in, using the MonoBook skin and the English interface language.
I could not verify that weblinkchecker actually uses its user account when it only reads pages. A test revealed no apparent difference in behaviour when I renamed login-data/wikipedia-ksh-Weblinkchecker-login.data to something else.
If there are questions, I am prepared to provide more info, once I know where to look.
----------------------------------------------------------------------
pywikipedia-l@lists.wikimedia.org