Support Requests item #3019475 was opened at 2010-06-22 01:59
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=3019475...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Install Problem
Group: None
Status: Closed
Priority: 5
Private: No
Submitted By: https://www.google.com/accounts ()
Assigned to: Nobody/Anonymous (nobody)
Summary: No JSON object could be decoded [FIXED]
Initial Comment: On Ubuntu 10.04, Karmic LAMP (PHP 5.2) Python 2.6.5, pywikipediabot from 2010-05-29 SVN, using identical server and bot configuration files as on a Mac setup (however, in this case, pywikipediabot reports an IP address, so I didn't need to hack httpd.conf), I get the following:
"Logging into FamilyName:en as UserName via API Error downloading data: No JSON object could be decoded Request en:/scriptpath/api.php? Retrying in x seconds
I changed the retry delay to milliseconds so I could see the final error message without waiting, which is:
ERROR: ApiGetDataParse caused error No JSON object could be decoded
The program also creates a dump file containing the following:
Error reported: No JSON object could be decoded 127.0.0.1 /scriptpath/api.php?
<feff>{"login":{"result":"NeedToken","token":"[some md5-looking hash]"}}
Any ideas?
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts () Date: 2012-04-13 01:34
Message: Just to share: I hit this issue too. A BOM had been inserted into LocalSettings.php (the file had been edited by an MS Windows user). I removed it, and now everything works fine.
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts () Date: 2010-08-05 23:05
Message: Finally!!!! The problem is that the DynamicPageList extension had BOMs at the beginning of its initialization file. Because this is a "require_once" extension, the BOM was getting inserted into the output, and Ubuntu's version of PHP or Apache (not sure which) does not sanitize it, whereas the Mac (and, seemingly, everyone else's installation) DOES sanitize BOMs before the output is sent. I am not sure why BeautifulSoup.py doesn't catch this, but for whatever reason it doesn't. Unless you're using UTF-16 files, you really shouldn't have a BOM anyway...
To check whether you have any stray BOMs lying around, MediaWiki actually includes a handy script in the t/maint directory called "bom.t". If you're curious, go to your main MediaWiki directory and run "perl t/maint/bom.t", and it will tell you which files are problematic.
If you just want to blast away and fix the problem, a combination of two handy scripts took care of it for me. Put one or both in an executable path, but be sure to modify the shell script to refer to the absolute path of the Perl script:
This one I call "RecursiveBOMDefuse.sh"
#!/bin/sh
#
if [ "$1" = "" ] ; then
    echo "Usage: $0 directory"
    exit
fi
# Get the list of files in the directory
find "$1" -type f | while read Name ; do
    # Based on the file name, perform the conversion
    case "$Name" in
    (*)
        # markup text
        NameTxt="${Name}"
        /absolute/path/to/BOMdefuse.plx "$NameTxt"
        # alternatively: perl /absolute/path/to/BOMdefuse.plx "$NameTxt"
        ;;
    esac
done
The second, which I call BOMdefuse.plx, is a Perl script I found on the W3C website - I'm really not sure why they haven't made it operate recursively, but the shell script takes care of that. If I had the time, I'd fix the Perl script to handle everything, but I'm just so happy about getting the bot working again that I'm going back to editing/cleaning up content.
#!/usr/bin/perl
# program to remove a leading UTF-8 BOM from a file
# works both STDIN -> STDOUT and on the spot (with filename as argument)
# from http://people.w3.org/rishida/blog/?p=102
#
if ($#ARGV > 0) {
    print STDERR "Too many arguments!\n";
    exit;
}

my @file;       # file content
my $lineno = 0;

my $filename = $ARGV[0];
if ($filename) {
    open( BOMFILE, $filename ) || die "Could not open source file for reading.";
    while (<BOMFILE>) {
        if ($lineno++ == 0) {
            if ( index( $_, "\xEF\xBB\xBF" ) == 0 ) {
                s/^\xEF\xBB\xBF//;
                print "BOM found and removed.\n";
            }
            else {
                print "No BOM found.\n";
            }
        }
        push @file, $_;
    }
    close (BOMFILE) || die "Can't close source file after reading.";

    open (NOBOMFILE, ">$filename") || die "Could not open source file for writing.";
    foreach $line (@file) {
        print NOBOMFILE $line;
    }
    close (NOBOMFILE) || die "Can't close source file after writing.";
}
else {  # STDIN -> STDOUT
    while (<>) {
        if (!$lineno++) {
            s/^\xEF\xBB\xBF//;
        }
        push @file, $_;
    }
    foreach $line (@file) {
        print $line;
    }
}
Obviously, run a chmod +x on both of these.
Then go to your main MediaWiki directory and run "RecursiveBOMDefuse.sh ." - it may take a minute or two, but it works!
Note: If you use symlinks anywhere in your installation, the script above does not seem to follow them, so you have to run it from the actual directory. Although slightly annoying, this is probably a good thing, as a bad set of symlinks could send the script off to run through your entire drive (or, if you're on a system with NFS mounts, the whole network/cluster!!!).
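If you'd rather stay in Python, a rough equivalent of the two scripts above is sketched below. This is an untested sketch under assumptions: it handles UTF-8 BOMs only and rewrites files in place; note that os.walk() does not descend into symlinked directories by default, which matches the symlink behaviour just described.

#!/usr/bin/env python
# Sketch: recursively strip a leading UTF-8 BOM from every file
# under a directory, rewriting files in place.
import codecs
import os
import sys

def defuse(path):
    f = open(path, 'rb')
    data = f.read()
    f.close()
    if data.startswith(codecs.BOM_UTF8):
        f = open(path, 'wb')
        f.write(data[len(codecs.BOM_UTF8):])
        f.close()
        print 'BOM found and removed:', path

if __name__ == '__main__':
    root = sys.argv[1] if len(sys.argv) > 1 else '.'
    # followlinks defaults to False, so symlinked directories
    # are not descended into.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            defuse(os.path.join(dirpath, name))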
I hope this helps others, and Ubuntu or Pywikipediabot folks, please take a look at your PHP/Apache and BeautifulSoup.py - stray BOMs should not be getting through..... (Of course, extension authors should sanitize their extensions first, but talk about herding cats).
-Alex
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts () Date: 2010-06-29 08:01
Message: Still doesn't work with
Pywikipediabot (r8335 (wikipedia.py), 2010/06/26, 10:07:01) Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) [GCC 4.4.3]
(or python 2.5.4)
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts () Date: 2010-06-22 10:21
Message: Thanks for the suggestions and thanks for taking a look.
I'm using the stock 3321-byte api.php from MediaWiki 1.15.4, downloaded straight from mediawiki.org, dated 2009-05-05 (extracted from the tarball via tar zxf). I am using a default (apt-get) install of python 2.6.4 on a fresh install of Ubuntu 10.04, and I just checked out the latest pywikipediabot from svn via svn co http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia pywikipedia several hours ago. I've disabled the confusing mess that is AppArmor, so there should be no issues there. My terminal is set to UTF-8 encoding.
I get the same problem with python 2.5.4 (e.g., "python2.5 login.py"), but only on this particular machine.
I have made no changes to urllib2, which is what login.py imports by default, and I have made no changes to urllib, which is what a default family file imports.
The family file I am using was created on a Mac in vim. As far as I know, vim doesn't add UTF-16 BOMs unless explicitly asked to do so, and I have not explicitly done that. Just in case, on the linux box, I created a new file and copy-pasted the family file text into it, renamed the old one, renamed the new one properly, deleted all .pyc files, and I still get this error. I have changed urllib2 to urllib and vice versa in each, both, and neither of login.py and the family file, all with the same result.
Here is some more error output, although I am not sure if it helps:
ERROR: ApiGetDataParse caused error No JSON object could be decoded 127.0.0.1 /scriptpath/api.php?. Dump ApiGetDataParse_FamilyName_en__Tue_Jun_22_18-54-23_2010.dump created.

Traceback (most recent call last):
  File "login.py", line 437, in <module>
    main()
  File "login.py", line 433, in main
    loginMan.login()
  File "login.py", line 320, in login
    cookiedata = self.getCookie(api)
  File "login.py", line 182, in getCookie
    response, data = query.GetData(predata, self.site, sysop=self.sysop, back_response = True)
  File "/home/user/bots/pywikipedia/query.py", line 170, in GetData
    raise lastError
ValueError: No JSON object could be decoded
It looks like BeautifulSoup.py (starting at line 1828) should strip out any <feff> BOMs and replace them with null characters, but it doesn't seem to be doing that.
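One way to confirm whether the BOM is actually coming over the wire (rather than being introduced client-side) is to look at the raw bytes of the response directly; a quick sketch, with a placeholder URL standing in for the real host/scriptpath:

# Sketch: inspect the first bytes of the raw api.php response.
# The URL below is a placeholder; substitute your own host/scriptpath.
import urllib2

data = urllib2.urlopen(
    'http://127.0.0.1/scriptpath/api.php?action=query&format=json').read()
print repr(data[:16])
# A response beginning with '\xef\xbb\xbf' carries a UTF-8 BOM,
# which is exactly what json.loads() chokes on.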
I'm using completely stock installs of everything, straight from svn, repositories, and official websites. My family file is built straight from the template, and it is identical to the one that works on the Mac and on an Ubuntu 8.04 install of the same wiki.
I have tried
python login.py -v -clean
and I get the following when viewing the dumpfile via cat:
Error reported: No JSON object could be decoded 127.0.0.1 /hcrscript/api.php?action=logout&format=json
[]
and this, when viewing the dumpfile in vim:
Error reported: No JSON object could be decoded 127.0.0.1 /hcrscript/api.php?action=logout&format=json
<feff>[]
As for other potentially-relevant info, I am using short URLs via httpd.conf aliases, but this should make no difference at all, as it works on other systems running php 5.2 and apache 2.2.
Alias /scriptpath /path/to/scriptpath
Alias /wiki /path/to/scriptpath/index.php
I have /scriptpath set as the scriptpath in my family file, and my api.php call is to '%s/api.php' (I have also tried u'%s/api.php' to try to get BeautifulSoup to convert any errant Unicode - I still get identical errors).
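For reference, the relevant pieces of a trunk-era family file look roughly like the sketch below (name, host, and version are placeholders; in trunk the base class's apipath() builds the '%s/api.php' string from scriptpath(), so the two should stay in sync):

# -*- coding: utf-8 -*-
# Sketch of a minimal family file (familyname_family.py); the name,
# host, and version strings are placeholders, not values from this report.
import family

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'familyname'
        self.langs = {'en': '127.0.0.1'}

    def scriptpath(self, code):
        # Matches the httpd.conf Alias above.
        return '/scriptpath'

    def version(self, code):
        return '1.15.4'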
My syslog and /var/log/messages show no errors, and Apache reports "POST /hcrscript/api.php HTTP/1.1" 200.
I've tried uncommenting the "raise NotImplementedError" line in my family file and commenting out use_api_login = True in my user-config.py file (or leaving it as-is), but this just returns:
API disabled because this site does not support. Retrying by ordinary way... Logging in to Wiki:en as UserName Login failed. Wrong password or CAPTCHA answer?
I'm completely stumped.
Thanks for any suggestions/advice you may have....
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2010-06-22 02:59
Message: The <feff> is a Unicode byte order mark (U+FEFF). Either urllib was changed, or you made some change to api.php that accidentally added it. Could you double-check that your api.php is unchanged from the original MediaWiki files (in other words: replace it with an original from SVN/release)?
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts () Date: 2010-06-22 02:06
Message: Looking at some earlier logs, I see that this problem first appeared when I upgraded from Python 2.6.1 to 2.6.2 in May. I am surprised that I seem to be the only person having this problem.
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org