Patches item #3038426, was opened at 2010-08-02 20:45
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3038426&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: xqt (xqt)
Summary: more ignore templates for commonscat
Initial Comment:
Please add ignore templates for ru and tt to commonscat.
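For reference, a minimal sketch of the shape of the requested change, assuming
commonscat.py's per-language ignoreTemplates mapping; the template names below
are placeholders, not the actual ru/tt entries:

# Sketch only: extend commonscat.py's per-language ignore mapping
# with ru and tt. The template names are placeholders for the real
# entries that were committed.
ignoreTemplates = {
    # ... existing languages ...
    'ru': [u'Some-ru-template'],   # placeholder name
    'tt': [u'Some-tt-template'],   # placeholder name
}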
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2010-08-06 21:46
Message:
done in r8380. thanks.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3038426&group_…
Bugs item #2901691, was opened at 2009-11-21 11:59
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2901691&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
>Priority: 6
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: cosmetic.py - stubs and spaces
Initial Comment:
cosmetic.py adds spaces in header delimiters - not a big deal, but both styles are acceptable on en, and unspaced is the de facto style for articles by about 3:1.
cosmetic.py moves categories to their correct place - ''except'' that the stub template should come ''after'' the categories: http://en.wikipedia.org/wiki/Wikipedia:Stub#How_to_mark_an_article_as_a_stub
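A minimal sketch of the ordering the guideline asks for (simplified regexes,
not cosmetic.py's actual code):

import re

def move_stubs_after_categories(text):
    # Collect stub templates, drop them from the body, and re-append
    # them after everything else, categories included.
    stubs = re.findall(r'\{\{[^{}]*?stub\}\}', text)
    if not stubs:
        return text
    for stub in stubs:
        text = text.replace(stub, '')
    return text.rstrip() + '\n\n' + '\n'.join(stubs) + '\n'

print move_stubs_after_categories(
    u'Some text.\n{{US-stub}}\n[[Category:Examples]]\n')
# Some text.
#
# [[Category:Examples]]
#
# {{US-stub}}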
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2010-08-06 09:19
Message:
This also applies to th-wiki, see
http://de.wikipedia.org/wiki/Benutzer_Diskussion:Xqt#About_Xqbot
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2901691&group_…
Bugs item #3038687, was opened at 2010-08-03 10:31
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3038687&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: JAn (jandudik)
>Assigned to: xqt (xqt)
Summary: cs disambiguation templates
Initial Comment:
Please change wikipedia_family.py: in the section
self.disambiguationTemplates =
set 'cs' to None.
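In other words, the requested end state, as a sketch (the class below is a
stand-in for the real family class, with all other languages omitted):

# Sketch only: the requested fragment of wikipedia_family.py.
class Family(object):            # stand-in for the real family class
    def __init__(self):
        self.disambiguationTemplates = {
            'cs': None,          # was a list of template names
        }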
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2010-08-06 08:18
Message:
done in r8379
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3038687&group_…
Support Requests item #3019475, was opened at 2010-06-22 08:59
Message generated for change (Comment added) made by
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=3019475&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Install Problem
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: https://www.google.com/accounts ()
Assigned to: Nobody/Anonymous (nobody)
>Summary: No JSON object could be decoded [FIXED]
Initial Comment:
On Ubuntu 10.04, Karmic LAMP (PHP 5.2), Python 2.6.5, and pywikipediabot from 2010-05-29 SVN, using the same server and bot configuration files as on a Mac setup (though in this case pywikipediabot reports an IP address, so I didn't need to hack httpd.conf), I get the following:
Logging into FamilyName:en as UserName via API
Error downloading data: No JSON object could be decoded
Request en:/scriptpath/api.php?
Retrying in x seconds
I changed this to milliseconds so I could see the final error message promptly, which is:
ERROR: ApiGetDataParse cause error No JSON object could be decoded
The program also creates a dump file containing the following:
Error reported: No JSON object could be decoded
127.0.0.1
/scriptpath/api.php?
<feff>{"login":{"result":"NeedToken","token":"[some md5-looking hash]"}}
Any ideas?
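For anyone reproducing this, a minimal sketch of what appears to be happening
(the token value is illustrative): Python's json module rejects input that
begins with a BOM, and stripping U+FEFF first makes the payload decode.

# Sketch: json.loads fails on a leading BOM (U+FEFF); stripping it
# first lets the quoted login response decode. The token value here
# is illustrative, not the actual hash.
import json

raw = u'\ufeff{"login":{"result":"NeedToken","token":"abc123"}}'
try:
    json.loads(raw)
except ValueError as e:
    print e                       # No JSON object could be decoded
print json.loads(raw.lstrip(u'\ufeff'))['login']['result']   # NeedToken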
----------------------------------------------------------------------
>Comment By: https://www.google.com/accounts ()
Date: 2010-08-06 06:05
Message:
Finally!!!! The problem is that the DynamicPageList extension had a BOM at the
beginning of its initialization file. Because this is a "require_once"
extension, the BOM was getting inserted into the headers, and Ubuntu's version
of PHP or Apache (not sure which) does not sanitize those, whereas the Mac
(and, seemingly, everyone else's installation) DOES sanitize the BOMs before
parsing. I am not sure why BeautifulSoup.py doesn't catch this, but for
whatever reason it doesn't. Unless you're using UTF-16 files, you really
shouldn't have a BOM anyway...
To check whether you have any stray BOMs lying around, MediaWiki actually
includes a handy script called "bom.t" in the t/maint directory. If you're
curious, go to your main MediaWiki directory, run "perl t/maint/bom.t", and it
will tell you which files are problematic.
If you just want to blast away and fix the problem, a combination of two handy
scripts took care of it for me. Put one or both in an executable path, but be
sure to modify the shell script to refer to the absolute path of the Perl
script.
This one I call "RecursiveBOMDefuse.sh":
#!/bin/sh
#
if [ "$1" = "" ] ; then
    echo "Usage: $0 directory"
    exit
fi
# Get the list of files in the directory
find "$1" -type f |
while read Name ; do
    # Based on the file name, perform the conversion
    case "$Name" in
    (*) # markup text
        NameTxt="${Name}"
        /absolute/path/to/BOMdefuse.plx "$NameTxt"
        # alternatively: perl /absolute/path/to/BOMdefuse.plx "$NameTxt"
        ;;
    esac
done
The next one I call "BOMdefuse.plx"; it's a Perl script I found on W3C's
website. I'm really not sure why they haven't made it operate recursively, but
the shell script takes care of that. If I had the time, I'd fix the Perl
script to handle everything, but I'm just so happy about getting the bot
working again that I'm going back to editing/cleaning up content.
#!/usr/bin/perl
# Program to remove a leading UTF-8 BOM from a file.
# Works both STDIN -> STDOUT and in place (with a filename as argument).
# From http://people.w3.org/rishida/blog/?p=102
#
if ($#ARGV > 0) {
    print STDERR "Too many arguments!\n";
    exit;
}
my @file;                 # file content
my $lineno = 0;
my $filename = $ARGV[0];
if ($filename) {
    open( BOMFILE, $filename ) || die "Could not open source file for reading.";
    while (<BOMFILE>) {
        if ($lineno++ == 0) {
            # Check for the three UTF-8 BOM bytes at the start of the line
            if ( index( $_, "\xEF\xBB\xBF" ) == 0 ) {
                s/^\xEF\xBB\xBF//;
                print "BOM found and removed.\n";
            }
            else { print "No BOM found.\n"; }
        }
        push @file, $_;
    }
    close (BOMFILE) || die "Can't close source file after reading.";
    open (NOBOMFILE, ">$filename") || die "Could not open source file for writing.";
    foreach $line (@file) {
        print NOBOMFILE $line;
    }
    close (NOBOMFILE) || die "Can't close source file after writing.";
}
else {                    # STDIN -> STDOUT
    while (<>) {
        if (!$lineno++) {
            s/^\xEF\xBB\xBF//;
        }
        push @file, $_;
    }
    foreach $line (@file) {
        print $line;
    }
}
Obviously, run a chmod +x on both of these. Then go to your main MediaWiki
directory and run "RecursiveBOMDefuse.sh ." - it may take a minute or two, but
it works!
Note: if you use symlinks anywhere in your installation, the script above does
not seem to follow them, so you have to run it from the actual directory.
Although slightly annoying, this is probably a good thing, as a bad set of
symlinks could send the script off through your entire drive (or, if you're on
a system with NFS mounts, the whole network/cluster!!!).
I hope this helps others. Ubuntu and Pywikipediabot folks, please take a look
at your PHP/Apache and at BeautifulSoup.py - stray BOMs should not be getting
through. (Of course, extension authors should sanitize their extensions first,
but talk about herding cats.)
-Alex
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-06-29 15:01
Message:
Still doesn't work with
Pywikipediabot (r8335 (wikipedia.py), 2010/06/26, 10:07:01)
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3]
(or python 2.5.4)
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-06-22 17:21
Message:
Thanks for the suggestions and thanks for taking a look.
I'm using the stock 3321-byte api.php from MediaWiki 1.15.4, downloaded
straight from mediawiki.org, dated 2009-05-05 (extracted from the tarball
via tar zxf). I am using a default (apt-get) install of python 2.6.4 on a
fresh install of Ubuntu 10.04, and I just checked out the latest
pywikipediabot from svn via svn co
http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia pywikipedia
several hours ago. I've disabled the confusing mess that is AppArmor, so
there should be no issues there. My terminal is set to UTF-8 encoding.
I get the same problem with python 2.5.4 (e.g., "python2.5 login.py"), but
only on this particular machine.
I have made no changes to urllib2, which is what login.py imports by
default, and I have made no changes to urllib, which is what a default
family file imports.
The family file I am using was created on a Mac in vim. As far as I know, vim
doesn't add UTF-16 BOMs unless explicitly asked to, and I have not asked it
to. Just in case, on the Linux box I created a new file, copy-pasted the
family file text into it, renamed the old one, renamed the new one properly,
and deleted all .pyc files - and I still get this error. I have changed
urllib2 to urllib and vice versa in each, both, and neither of login.py and
the family file, all with the same result.
Here is some more error output, although I am not sure if it helps:
ERROR: ApiGetDataParse caused error No JSON object could be decoded
127.0.0.1
/scriptpath/api.php?. Dump
ApiGetDataParse_FamilyName_en__Tue_Jun_22_18-54-23_2010.dump created.
Traceback (most recent call last):
File "login.py", line 437, in <module>
main()
File "login.py", line 433, in main
loginMan.login()
File "login.py", line 320, in login
cookiedata = self.getCookie(api)
File "login.py", line 182, in getCookie
response, data = query.GetData(predata, self.site, sysop=self.sysop,
back_response = True)
File "/home/user/bots/pywikipedia/query.py", line 170, in GetData
raise lastError
ValueError: No JSON object could be decoded
It looks like BeautifulSoup.py (starting at line 1828) should strip out any
<feff> BOMs and replace them with null characters, but it doesn't seem to be
doing that.
I'm using completely stock installs of everything, straight from svn,
repositories, and official websites. My family file is built straight from
the template, and it is identical to the one that works on the Mac and on
an Ubuntu 8.04 install of the same wiki.
I have tried
python login.py -v -clean
and I get the following when viewing the dumpfile via cat:
Error reported: No JSON object could be decoded
127.0.0.1
/hcrscript/api.php?action=logout&format=json
[]
and this, when viewing the dumpfile in vim:
Error reported: No JSON object could be decoded
127.0.0.1
/hcrscript/api.php?action=logout&format=json
<feff>[]
As for other potentially relevant info: I am using short URLs via httpd.conf
aliases, but this should make no difference at all, as it works on other
systems running PHP 5.2 and Apache 2.2.
alias /scriptpath /path/to/scriptpath
alias /wiki /path/to/scriptpath/index.php
I have /scriptpath set as the scriptpath in my family file, and my api.php
call is to '%s/api.php' (I have also tried u'%s/api.php' to try to get
BeautifulSoup to convert any errant Unicode - I still get identical errors).
My syslog and /var/log/messages show no errors, and apache reports "POST
/hcrscript/api.php HTTP/1.1" 200.
I've tried uncommenting the "raise NotImplementedError" line in my family
file and commenting out use_api_login = True in my user-config.py file (or
leaving it as-is), but this just returns:
API disabled because this site does not support.
Retrying by ordinary way...
Logging in to Wiki:en as UserName
Login failed. Wrong password or CAPTCHA answer?
I'm completely stumped.
Thanks for any suggestions/advice you may have....
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-06-22 09:59
Message:
The <feff> is a BOM (byte order mark). Either urllib was changed, or you made
some change to api.php that accidentally added it. Could you double-check that
your api.php is unchanged from the original MediaWiki files (in other words,
replace it with an original from SVN/release)?
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-06-22 09:06
Message:
Looking at some earlier logs, I see that this problem first appeared when I
upgraded from Python 2.6.1 to 2.6.2 in May. I am surprised that I seem to
be the only person having this problem.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=3019475&group_…
Bugs item #3040326, was opened at 2010-08-05 19:04
Message generated for change (Tracker Item Submitted) made by betacommand
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3040326&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: betacommand (betacommand)
Assigned to: Nobody/Anonymous (nobody)
Summary: SpamFilter error not raising
Initial Comment:
This is related to https://sourceforge.net/tracker/?func=detail&aid=3028176&group_id=93107&ati… - pywikipedia quietly fails when triggering the spam filter.
{u'edit': {u'spamblacklist': u'http://www.gamerbrain.net', u'result': u'Failure'}}
is one example of the returned dict; either the API changed or someone broke something.
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3]
config-settings:
use_api = True
use_api_login = True
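A minimal sketch of why the failure is quiet, assuming the 'error'-key check
described in the related bug below (data is the dict quoted above):

# Sketch: the response is a plain dict with no 'error' key, so a guard
# like "if 'error' in data:" never fires, even though the edit failed.
data = {u'edit': {u'spamblacklist': u'http://www.gamerbrain.net',
                  u'result': u'Failure'}}
print 'error' in data              # False -> failure passes silently
print data[u'edit'][u'result']     # Failure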
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3040326&group_…
Bugs item #3028176, was opened at 2010-07-12 00:29
Message generated for change (Comment added) made by lusum
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3028176&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: lusum (lusum)
Assigned to: Nobody/Anonymous (nobody)
Summary: copyright_put doesn't put some files
Initial Comment:
I use copyright_put to publish the results of copyright.py on it.wikipedia, but sometimes the script deletes the results without updating the page.
I am attaching an example results file (output.txt) that doesn't work with the script...
----------------------------------------------------------------------
>Comment By: lusum (lusum)
Date: 2010-08-06 01:21
Message:
It is confirmed: wikipedia.py doesn't handle the spam filter. I tried to save
a page with a spam link, and the bot didn't recognize the problem.
In _putPage(self, text, comment=None, watchArticle=False, minorEdit=True,
newPage=False, token=None, newToken=False, sysop=False, captcha=None,
botflag=True, maxTries=-1), I printed data just before the line
if 'error' in data:
Printing data gives me:
{u'edit': {u'spamblacklist': u'http://www.gamerbrain.net', u'result': u'Failure'}}
That means there is no 'error' key in data, but the result is a failure.
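A minimal sketch of the missing check; SpamfilterError below is a stand-in for
the exception wikipedia.py already defines, and this is an illustration, not
the committed fix:

# Sketch only: raise instead of silently succeeding when the edit
# result is a spam-blacklist failure. SpamfilterError is a stand-in
# for the exception already defined in wikipedia.py.
class SpamfilterError(Exception):
    def __init__(self, url):
        Exception.__init__(self, 'Spam filter rejected link: %s' % url)
        self.url = url

def check_put_result(data):
    edit = data.get(u'edit', {})
    if edit.get(u'result') == u'Failure' and u'spamblacklist' in edit:
        raise SpamfilterError(edit[u'spamblacklist'])

data = {u'edit': {u'spamblacklist': u'http://www.gamerbrain.net',
                  u'result': u'Failure'}}
check_put_result(data)   # raises SpamfilterError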
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2010-07-12 12:45
Message:
It seems that pywikipedia or copyright.py doesn't handle the spam filter well.
----------------------------------------------------------------------
Comment By: lusum (lusum)
Date: 2010-07-12 12:35
Message:
It's not possible to save the page because of the antispam filter...
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2010-07-12 02:09
Message:
If you receive an "output.txt deleted" message, then the bug isn't in the
"copyright_put" script but in the Page::put() function, which returns
successfully instead of raising a "PageNotSaved" error. Do you get an error
message from MediaWiki when manually putting the text on the it.wiki site?
That would help identify the cause.
In spite of this reply, please note that the copyright scripts are nowadays
unmaintained, so you may experience a loss of functionality.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3028176&group_…
Patches item #3036375, was opened at 2010-07-28 22:59
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3036375&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: rewrite
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Nakor Wikipedia (nakor-wikipedia)
Assigned to: Russell Blau (russblau)
Summary: page.fullVersionHistory does not return a value
Initial Comment:
Attached is a patch to make page.fullVersionHistory produce the same output as in pywikipedia.
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2010-08-04 10:09
Message:
Applied in r8377
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3036375&group_…
Bugs item #3036372, was opened at 2010-07-28 22:41
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3036372&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: rewrite
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Russell Blau (russblau)
Summary: Missing argument in page.put
Initial Comment:
The botflag argument is missing from page.put in pywikibot.
This argument allows the bot to perform an edit that will not be marked with the bot flag.
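A sketch of the intended usage once the parameter exists (site, page, and
parameter names are assumptions based on this thread, not verified against the
commit):

# Sketch: an edit saved without the bot flag, assuming page.put grows
# a botflag parameter as requested. Page title and comment are
# illustrative.
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'User:ExampleBot/sandbox')
text = page.get()
page.put(text + u'\ntest edit', comment=u'testing botflag',
         botflag=False)   # edit appears in recent changes without 'b'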
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2010-08-04 10:03
Message:
done in r8376
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3036372&group_…