Bugs item #3538008, was opened at 2012-06-25 21:52
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3538008&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: James (jaclayiii)
Assigned to: Nobody/Anonymous (nobody)
Summary: *-login.data can have case discrepancy on Linux host
Initial Comment:
Pywikipedia [http] trunk/pywikipedia (r10401, 2012/06/21, 06:18:43)
Python 2.7.2+ (default, Oct 4 2011, 20:06:09)
[GCC 4.6.1]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
Summary: the *-login.data file may be saved with an uppercase username, but when _loadCookies tries to find it on a Linux host, the case of the username may be lower. This has the unintended consequence of preventing bots from logging in on private wikis that have anonymous read API rights disabled.
If a user connects to a wiki that has LDAP or some other form of "add-on" authentication, the user name returned may vary in case from what is in the user-config.py file. The reason this matters is that the <wikifamily>-<language>-<username>-login.data file may be saved with an upper case letter in the username. Thus if the user-config.py file contained:
users["mywiki"]["en"]="james"
but the LDAP authenticator returned "James" as the username, then the *-login.data file would be mywiki-en-James-login.data. But when _loadCookies goes to look for such a file on line 5572:
if os.path.exists(localPA):
localPA is /~some/path/to/mywiki-en-james-login.data
Notice that "James" is now lower case in the path above.
As Linux is case sensitive, it cannot find the login data and thus prevents access to wikis that do not allow anonymous access to APIs. A temporary workaround requires setting the user name to the appropriate case (even if the username is case insensitive in the LDAP authentication scheme), for example:
users["mywiki"]["en"]="James"
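One possible fix, sketched below as a minimal hypothetical helper (find_login_data is not a function in the pywikipedia codebase), would be to fall back to a case-insensitive match on the cookie file name when the exact path is missing:

```python
import os

def find_login_data(directory, filename):
    """Return the real path of `filename` inside `directory`, matching the
    name case-insensitively, or None if nothing matches.  Hypothetical
    helper illustrating how _loadCookies could tolerate a case mismatch
    between user-config.py and the username the wiki returned."""
    lowered = filename.lower()
    for entry in os.listdir(directory):
        if entry.lower() == lowered:
            return os.path.join(directory, entry)
    return None
```

With a file mywiki-en-James-login.data on disk, find_login_data(path, 'mywiki-en-james-login.data') would still locate it on a case-sensitive filesystem.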
keywords: SSL, Login failure, https login failure, https linux login, https pywikipedia, https pywikipedia linux
----------------------------------------------------------------------
Comment By: James (jaclayiii)
Date: 2012-06-26 15:20
Message:
After thinking even more on this issue, even for those not using LDAP
authentication (which I would assume the majority are not using), correct
casing based on the user-config file shouldn't have undesirable effects: if
you can log in with what's in the user config file then correctly saving
the cookie file with that username shouldn't negatively impact anything. On
the other hand if you do not save the cookie file with the same user name
that is in the user-config file, but you continue to use the user-config
file to generate the localPA variable, then you may have problems on case
sensitive platforms.
If this fix seems too difficult (I don't believe it to be) or you're
suspicious of the logic, you may want to place a warning in the setup
instructions. I've added a comment on the wiki for user-config.py that
people using *nix systems should be aware that by default mediawiki has
uppercase user names.
----------------------------------------------------------------------
Comment By: James (jaclayiii)
Date: 2012-06-26 12:08
Message:
After rereading the LDAP link, you're probably right that it's the
actual MediaWiki login that is forcing uppercase; nonetheless, the file name
that pywiki attempts to find should be case-correct regardless of the
username supplied or returned. My thought for that fix has to do with
correctly saving the *-login.data cookie with the username found in
user-config.py.
----------------------------------------------------------------------
Comment By: James (jaclayiii)
Date: 2012-06-26 11:59
Message:
It has very much to do with LDAP:
http://www.mediawiki.org/wiki/Extension:LDAP_Authentication/User_Provided_I…
And it has very much to do with Linux: Linux path names are case sensitive.
I reported the bug as it took me time to track down and perhaps someone
else who has the misfortune of dealing with it will find this helpful.
The fact is that if I can log in with a lower case name, and I can, then
whatever pywiki stores should be in the same case, NOT whatever case the
wiki returns for the user name. The file name for *-login.data should
use the same case as what is stored in user-config.py.
This is just good practice, especially on a case sensitive host like
Linux.
Also, as mediawiki is authenticating against LDAP, whatever it stores as
the username is irrelevant if it correctly authenticates. A further reason
to enforce correct casing based on the user-config.py file.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-06-26 03:27
Message:
This has very little to do with Linux or LDAP, but rather has to do with
the fact your username is 'James' and not 'james'. This is related to the
'first character is capitalized' convention on some wikis, but not all
(!).
However, we could probably check whether the name has changed when the user
is logged in and emit a warning when this happens (and/or store the cookie
with the username as saved in the config file, but that could have some
unintended consequences).
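A minimal sketch of that warning idea (the function name and message are illustrative, not the actual pywikipedia API):

```python
def check_username_case(configured, returned):
    """Compare the username from user-config.py with the one MediaWiki
    reports after login, and warn on a mismatch.  Returns True when the
    two agree, False (after printing a warning) when they differ."""
    if configured == returned:
        return True
    print("WARNING: logged in as %r but user-config.py has %r; the "
          "*-login.data cookie file may not be found on case-sensitive "
          "filesystems." % (returned, configured))
    return False
```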
----------------------------------------------------------------------
Comment By: James (jaclayiii)
Date: 2012-06-25 21:56
Message:
Quick comment: _loadCookies() is in wikipedia.py on line 5534
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3538008&group_…
Bugs item #3604218, was opened at 2013-02-12 01:49
Message generated for change (Tracker Item Submitted) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3604218&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: category
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: sortkeyprefix: r11013 breaks catlib.py with mw 1.16
Initial Comment:
Retrieving a category's non-empty subcategories under mw 1.16 results in
> KeyError: 'sortkeyprefix'
API response
> Unrecognized value for parameter 'cmprop': sortkeyprefix
First added to the patches tracker by mistake (ID: 3603953)
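One way catlib.py could stay compatible with older releases is to pick the cmprop value from the MediaWiki version; note the 1.17 cutoff below is an assumption for illustration, not verified against the MediaWiki changelog:

```python
def cmprop_for(mw_version):
    """Return a cmprop value for list=categorymembers that the given
    MediaWiki version (as a (major, minor) tuple) should accept.
    'sortkeyprefix' is assumed to exist only from 1.17 onwards; older
    wikis get plain 'sortkey' to avoid the Unrecognized-value error."""
    if mw_version >= (1, 17):
        return 'ids|title|sortkeyprefix'
    return 'ids|title|sortkey'
```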
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3604218&group_…
Support Requests item #3019475, was opened at 2010-06-22 01:59
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=3019475&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Install Problem
Group: None
Status: Closed
Priority: 5
Private: No
Submitted By: https://www.google.com/accounts ()
Assigned to: Nobody/Anonymous (nobody)
Summary: No JSON object could be decoded [FIXED]
Initial Comment:
On Ubuntu 10.04, Karmic LAMP (PHP 5.2) Python 2.6.5, pywikipediabot from 2010-05-29 SVN, using identical server and bot configuration files as on a Mac setup (however, in this case, pywikipediabot reports an IP address, so I didn't need to hack httpd.conf), I get the following:
"Logging into FamilyName:en as UserName via API
Error downloading data: No JSON object could be decoded
Request en:/scriptpath/api.php?
Retrying in x seconds
I changed this to milliseconds so I could see the final error message sooner, which is:
ERROR: ApiGetDataParse caused error No JSON object could be decoded
The program also creates a dump file containing the following:
Error reported: No JSON object could be decoded
127.0.0.1
/scriptpath/api.php?
<feff>{"login":{"result":"NeedToken","token":"[some md5-looking hash]"}}
Any ideas?
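For what it's worth, the failure is easy to reproduce in isolation: json rejects a payload with a leading U+FEFF, and stripping the BOM first makes the same payload parse (the token value below is made up):

```python
import json

raw = u'\ufeff{"login": {"result": "NeedToken", "token": "abc123"}}'

try:
    json.loads(raw)  # raises ValueError: the BOM is not valid JSON
    parsed = True
except ValueError:
    parsed = False

# Stripping the BOM before decoding works fine.
data = json.loads(raw.lstrip(u'\ufeff'))
```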
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2012-04-13 01:34
Message:
Just to share :
I got the issue too. The UTF-16 BOM was inserted in LocalSettings.php
(edited by a MS Windows user).
I removed it and now everything works fine.
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-08-05 23:05
Message:
Finally!!!! The problem is that DynamicPageList extension had BOMs at the
beginning of its initialization file. Because this is a "require_once"
extension, it seems that the BOM was getting inserted into the headers, and
Ubuntu's version of PHP or Apache (not sure which) does not sanitize those,
whereas the Mac (and seemingly, everyone else's installation) DOES sanitize
the BOMs before parsing. I am not sure why BeautifulSoup.py doesn't catch
this, but for whatever reason it doesn't. Unless you're using UTF-16 files,
you really shouldn't have a BOM anyway...
To check if you have any stray BOMs lying around, MediaWiki has actually
included a handy script in the t/maint directory called "bom.t".
If you're curious, go to your main MediaWiki directory, then run "perl
t/maint/bom.t", and it will tell you which files are problematic.
If you just want to blast away and fix the problem, a combination of two
handy scripts took care of the problem for me. Put one or both in an
executable path, but be sure to modify the shell script to refer to the
absolute path to the Perl script:
This one I call "RecursiveBOMDefuse.sh"
#!/bin/sh
#
if [ "$1" = "" ] ; then
    echo "Usage: $0 directory"
    exit
fi
# Get list of files in the directory
find "$1" -type f |
while read Name ; do
    # Based on the file name, perform the conversion
    case "$Name" in
    (*) # markup text
        NameTxt="${Name}"
        /absolute/path/to/./BOMdefuse.plx "$NameTxt";
        # alternatively, you could use: perl /absolute/path/to/BOMdefuse.plx "$NameTxt";
        ;;
    esac
done
The next, I call BOMdefuse.plx, which is a perl script I found at W3C's
website - I'm really not sure why they haven't made this operate
recursively, but the shell takes care of that. If I had the time, I'd fix
the Perl script to handle everything, but I'm just so happy about getting
the bot working again that I'm going back to work on editing/cleaning up
content.
#!/usr/bin/perl
# program to remove a leading UTF-8 BOM from a file
# works both STDIN -> STDOUT and on the spot (with filename as argument)
# from http://people.w3.org/rishida/blog/?p=102
#
if ($#ARGV > 0) {
    print STDERR "Too many arguments!\n";
    exit;
}
my @file;            # file content
my $lineno = 0;
my $filename = $ARGV[0];
if ($filename) {
    open( BOMFILE, $filename ) || die "Could not open source file for reading.";
    while (<BOMFILE>) {
        if ($lineno++ == 0) {
            if ( index( $_, "\xEF\xBB\xBF" ) == 0 ) {
                s/^\xEF\xBB\xBF//;
                print "BOM found and removed.\n";
            }
            else { print "No BOM found.\n"; }
        }
        push @file, $_;
    }
    close (BOMFILE) || die "Can't close source file after reading.";
    open (NOBOMFILE, ">$filename") || die "Could not open source file for writing.";
    foreach $line (@file) {
        print NOBOMFILE $line;
    }
    close (NOBOMFILE) || die "Can't close source file after writing.";
}
else { # STDIN -> STDOUT
    while (<>) {
        if (!$lineno++) {
            s/^\xEF\xBB\xBF//;
        }
        push @file, $_;
    }
    foreach $line (@file) {
        print $line;
    }
}
Obviously, run chmod +x on both of these, then go to your main MediaWiki
directory and run "RecursiveBOMDefuse.sh ." - it may take a minute or two,
but it works!
Note: If you use symlinks anywhere in your installation, the script above
does not seem to follow them, so you have to run the script from the actual
directory. Although slightly annoying, this is probably a good thing, as a
bad set of symlinks could send this script off to run through your entire
drive (or if you're on a system with NFS mounts, the whole
network/cluster!).
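If you'd rather not juggle the shell/Perl pair, the same idea fits in a short Python sketch (written from scratch here, not part of pywikipedia) that walks a tree without following symlinked directories:

```python
import os

def strip_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, in place.
    Returns True when a BOM was found and stripped, False otherwise."""
    with open(path, 'rb') as f:
        data = f.read()
    if not data.startswith(b'\xef\xbb\xbf'):
        return False
    with open(path, 'wb') as f:
        f.write(data[3:])
    return True

def strip_boms_recursively(root):
    """De-BOM every regular file under `root` and return the list of
    files that were fixed.  os.walk does not descend into symlinked
    directories by default, mirroring the find-based script above."""
    fixed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if strip_bom(path):
                fixed.append(path)
    return fixed
```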
I hope this helps others, and Ubuntu or Pywikipediabot folks, please take a
look at your PHP/Apache and BeautifulSoup.py - stray BOMs should not be
getting through..... (Of course, extension authors should sanitize their
extensions first, but talk about herding cats).
-Alex
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-06-29 08:01
Message:
Still doesn't work with
Pywikipediabot (r8335 (wikipedia.py), 2010/06/26, 10:07:01)
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3]
(or python 2.5.4)
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-06-22 10:21
Message:
Thanks for the suggestions and thanks for taking a look.
I'm using the stock 3321-byte api.php from MediaWiki 1.15.4, downloaded
straight from mediawiki.org, dated 2009-05-05 (extracted from the tarball
via tar zxf). I am using a default (apt-get) install of python 2.6.4 on a
fresh install of Ubuntu 10.04, and I just checked out the latest
pywikipediabot from svn via svn co
http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia pywikipedia
several hours ago. I've disabled the confusing mess that is AppArmor, so
there should be no issues there. My terminal is set to UTF-8 encoding.
I get the same problem with python 2.5.4 (e.g., "python2.5 login.py"), but
only on this particular machine.
I have made no changes to urllib2, which is what login.py imports by
default, and I have made no changes to urllib, which is what a default
family file imports.
The family file I am using was created on a Mac in vim. As far as I know,
vim doesn't add UTF-16 BOMs unless explicitly asked to do so, and I have
not explicitly done that. Just in case, on the linux box, I created a new
file and copy-pasted the family file text into it, renamed the old one,
renamed the new one properly, deleted all .pyc files, and I still get this
error. I have changed urllib2 to urllib and vice versa in each, both, and
neither of login.py and the family file, all with the same result.
Here is some more error output, although I am not sure if it helps:
ERROR: ApiGetDataParse caused error No JSON object could be decoded
127.0.0.1
/scriptpath/api.php?. Dump
ApiGetDataParse_FamilyName_en__Tue_Jun_22_18-54-23_2010.dump created.
Traceback (most recent call last):
File "login.py", line 437, in <module>
main()
File "login.py", line 433, in main
loginMan.login()
File "login.py", line 320, in login
cookiedata = self.getCookie(api)
File "login.py", line 182, in getCookie
response, data = query.GetData(predata, self.site, sysop=self.sysop,
back_response = True)
File "/home/user/bots/pywikipedia/query.py", line 170, in GetData
raise lastError
ValueError: No JSON object could be decoded
It looks like BeautifulSoup.py (starting at 1828) should strip out any
<feff> BOMs and replace them with null characters, but it doesn't seem to
be doing that.
I'm using completely stock installs of everything, straight from svn,
repositories, and official websites. My family file is built straight from
the template, and it is identical to the one that works on the Mac and on
an Ubuntu 8.04 install of the same wiki.
I have tried
python login.py -v -clean
and I get the following when viewing the dumpfile via cat:
Error reported: No JSON object could be decoded
127.0.0.1
/hcrscript/api.php?action=logout&format=json
[]
and this, when viewing the dumpfile in vim:
Error reported: No JSON object could be decoded
127.0.0.1
/hcrscript/api.php?action=logout&format=json
<feff>[]
As for other potentially-relevant info, I am using short URLs via
httpd.conf aliases, but this should make no difference at all, as it works
on other systems running php 5.2 and apache 2.2.
alias /scriptpath /path/to/scriptpath
alias /wiki /path/to/scriptpath/index.php
I have /scriptpath set as the scriptpath in my family file, and my
api.php call is to '%s/api.php' (I have also tried u'%s/api.php' to try to
get BeautifulSoup to convert any errant unicode - I still get the identical
errors).
My syslog and /var/log/messages show no errors, and apache reports "POST
/hcrscript/api.php HTTP/1.1" 200".
I've tried uncommenting the "raise NotImplementedError" line in my family
file and commenting out use_api_login = True in my user-config.py file (or
leaving it as-is), but this just returns:
API disabled because this site does not support.
Retrying by ordinary way...
Logging in to Wiki:en as UserName
Login failed. Wrong password or CAPTCHA answer?
I'm completely stumped.
Thanks for any suggestions/advice you may have....
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2010-06-22 02:59
Message:
The <feff> is a UTF-16 BOM. Either urllib was changed, or you made some
change to api.php, accidentally adding it. Could you double-check if your
api.php is unchanged from the original mediawiki files (in other words:
replace it with an orginal from SVN/release)?
----------------------------------------------------------------------
Comment By: https://www.google.com/accounts ()
Date: 2010-06-22 02:06
Message:
Looking at some earlier logs, I see that this problem first appeared when I
upgraded from Python 2.6.1 to 2.6.2 in May. I am surprised that I seem to
be the only person having this problem.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=3019475&group_…
Bugs item #3604180, was opened at 2013-02-11 18:42
Message generated for change (Comment added) made by
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3604180&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: redirect
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Riley ()
Assigned to: Nobody/Anonymous (nobody)
Summary: Unicode issue running redirect.py
Initial Comment:
Hello everyone, I am having a unicode issue when running redirect.py on wikisource.org. When running the script, pywikipediabot seems to try to change the page names into English.
----------------------------------------------------------------------
>Comment By: Riley ()
Date: 2013-02-11 18:44
Message:
I clicked save before I was done writing -.-;
As also noticed in the provided screenshot, the script doesn't save or
give any output.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3604180&group_…
Feature Requests item #3604079, was opened at 2013-02-11 03:45
Message generated for change (Tracker Item Submitted) made by jandudik
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=3604079&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: JAn (jandudik)
Assigned to: Nobody/Anonymous (nobody)
Summary: Wikidata filling bot
Initial Comment:
For the second phase of Wikidata, something like this would be very useful:
fill_wikidata.py -template:geobox -parameter:area -to:P123
which would go through pages containing {{geobox}} and copy the value of the 'area' parameter to Wikidata property P123 of the related article. A parameter would tell whether to overwrite or only add the value when it is missing.
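A fiddly part of such a bot is pulling the parameter value out of the template call; here is a rough sketch of that step only (real wikitext template parsing is considerably messier: nested templates, <nowiki>, etc.):

```python
import re

def template_param(wikitext, template, param):
    """Return the value of `param` in the first {{template}} call found
    in `wikitext`, or None.  Naive sketch: does not handle nested
    templates or piped links inside values."""
    m = re.search(r'\{\{\s*' + re.escape(template) + r'\b(.*?)\}\}',
                  wikitext, re.S | re.I)
    if not m:
        return None
    for part in m.group(1).split('|'):
        key, sep, value = part.partition('=')
        if sep and key.strip().lower() == param.lower():
            return value.strip()
    return None
```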
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=3604079&group_…
Bugs item #3604077, was opened at 2013-02-11 03:41
Message generated for change (Tracker Item Submitted) made by jandudik
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3604077&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: login
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: JAn (jandudik)
Assigned to: Nobody/Anonymous (nobody)
Summary: Assertion error
Initial Comment:
interwiki.py -async -cleanup -links:template:Daerah_Hradec_Králové -lang:ms -untranslated -initialredirect
...
Getting 15 pages from wikipedia:ms...
Dump ms (wikipedia) appended.
Traceback (most recent call last):
File "D:\Py\interwiki.py", line 2603, in <module>
main()
File "D:\Py\interwiki.py", line 2577, in main
bot.run()
File "D:\Py\interwiki.py", line 2310, in run
self.queryStep()
File "D:\Py\interwiki.py", line 2283, in queryStep
self.oneQuery()
File "D:\Py\interwiki.py", line 2279, in oneQuery
subject.batchLoaded(self)
File "D:\Py\interwiki.py", line 1216, in batchLoaded
self.done.add(page)
File "D:\Py\interwiki.py", line 733, in add
assert page not in self.tree[site]
AssertionError
D:\Py>version.py
Pywikipedia trunk/pywikipedia/ (r11072, 2013/02/10, 16:52:07, ok)
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3604077&group_…
Bugs item #3603918, was opened at 2013-02-09 01:45
Message generated for change (Comment added) made by
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3603918&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: login
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Riley ()
Assigned to: Nobody/Anonymous (nobody)
Summary: Can't login to wikisource.org
Initial Comment:
I would like to be able to edit Wikisource with pywikipediabot; unlike all the other wikis, http://wikisource.org/ does not redirect to http://en.wikisource.org/ and thus I cannot log in to wikisource.org without playing with each script.
The following was outputted when I tried
family = 'wikisource'
mylang = ''
Traceback (most recent call last):
File "C:\Python27\pywikipedia1\redirect.py", line 65, in <module>
import wikipedia as pywikibot
File "C:\Python27\pywikipedia1\wikipedia.py", line 8717, in <module>
getSite(noLogin=True)
File "C:\Python27\pywikipedia1\pywikibot\support.py", line 115, in wrapper
return method(*__args, **__kw)
File "C:\Python27\pywikipedia1\wikipedia.py", line 8471, in getSite
_sites[key] = Site(code=code, fam=fam, user=user)
File "C:\Python27\pywikipedia1\pywikibot\support.py", line 115, in wrapper
return method(*__args, **__kw)
File "C:\Python27\pywikipedia1\wikipedia.py", line 5667, in __init__
% (self.__code, self.__family.name))
NoSuchSite: Language does not exist in family wikisource
----------------------------------------------------------------------
Comment By: Riley ()
Date: 2013-02-10 15:26
Message:
Suggested method works, please close or possibly add a note somewhere in
the code? (so people like me can see that they need to do '-')
----------------------------------------------------------------------
Comment By: Riley ()
Date: 2013-02-10 15:25
Message:
My mistake, I didn't see the suggested method.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2013-02-10 15:23
Message:
Please try the suggested method before increasing the priority.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2013-02-09 03:25
Message:
Instead of mylang='', try using mylang='-' (and also adapting your username
config as such).
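In other words, the user-config.py for the multilingual wikisource.org wiki would look something like this (config fragment only; the bot account name is a placeholder):

```python
# user-config.py -- sketch of the configuration suggested above
family = 'wikisource'
mylang = '-'
usernames['wikisource']['-'] = u'YourBotName'
```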
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3603918&group_…