-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
First of all, thanks a lot for your time and your mails!
it retries loading AND it applies the correct Unicode encoding to the
HTML page contents. Neither is done by urlopen, as far as I know...(?)
Right. Then abstract that stuff out of the Site urlopen into a
separate module, and use that.
The issue is not the bot logging out, the issue is you're using a
function for something it was never supposed to do. When you do
that, you shouldn't be surprised things break.
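
To make the "separate module" suggestion concrete, here is a minimal
sketch of a standalone helper that only retries and decodes, with no
wiki login logic at all. It is not pywikipedia code: the names
'fetch_text', 'urllib_opener' and the 'opener' parameter are made up for
this illustration, and the retry count and fallback charset are
arbitrary assumptions.

```python
import time
import urllib.request


def urllib_opener(url):
    """Fetch url and return (raw_bytes, charset_or_None)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read(), resp.headers.get_content_charset()


def fetch_text(url, opener=urllib_opener, retries=3, delay=1.0,
               default_charset="utf-8"):
    """Return the decoded text of url, retrying on I/O errors.

    'opener' is a hypothetical hook: any callable taking a URL and
    returning (raw_bytes, charset_or_None). It exists only so the
    helper can be tried without network access.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            raw, charset = opener(url)
        except IOError as err:  # URLError is a subclass of IOError/OSError
            last_error = err
            time.sleep(delay)   # wait before the next attempt
            continue
        # Decode with the charset the server declared, else the fallback.
        return raw.decode(charset or default_charset, errors="replace")
    raise last_error
```

Because nothing here knows about a Site object, calling it for
arbitrary (non-wiki) URLs cannot trigger any login check.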
This cannot be the way to go. You are right that I am "abusing" the
function, since the documentation describes it as
"Low-level routine to get a URL from the wiki."
and I am using it for arbitrary (non-wiki) URLs. But nothing is
mentioned there about any login attempt to the current (or any other)
wiki at all... The pywikipedia team did a really good job writing
this function, and it seems strange to me to copy the whole function
as it is, dropping just 1 or 2 lines of code to achieve what I need
(and then I would also have to maintain that copy in parallel).
As far as I can see at the moment, the problem is the call to
'self._getUserDataOld' at the end. I am not an expert in this, but I
tried to investigate it as well as I could. That is also the reason
why I asked the following (maybe stupid) questions:
So I do not understand how the initial login (by cookies) is done
and at what place in the code? Then I do not understand why the
later (re)login is done in a different way? And last, I do not
understand why 'LoginManager' asks for a password but does not
need it if there are cookies present? (this requested user input
seems to break my bot then...)
I was able to answer the first question: 'site._load' is responsible for
the very first login AND is also able to re-login for me. 'getUrl' is
NOT able to re-login, EVEN when accessing a page from dewiki... AND THIS
SHOULD WORK as far as I can see (so we have a bug here). I was not able
to answer the other two questions myself...
At the moment it looks to me like adding a keyword argument called
'noLogin' to 'getUrl' (similar to 'getSite'), which prevents 'getUrl'
from calling '_getUserDataOld' at the end, should solve my problem.
And this should not contradict 'getUrl' as it is described.
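
To illustrate the idea, here is a minimal sketch of the proposed
keyword argument. 'SketchSite' is a toy stand-in, NOT the real
pywikipedia Site class: its 'getUrl' and '_getUserDataOld' only mimic
the control flow described above, and the 'fetch' callable and the
'user_data_checks' counter are made up for this example.

```python
class SketchSite(object):
    """Toy stand-in for a Site object (hypothetical, for illustration)."""

    def __init__(self, fetch):
        # 'fetch' is any callable mapping a path/URL to page text.
        self._fetch = fetch
        self.user_data_checks = 0

    def _getUserDataOld(self, text):
        # Placeholder for the login-state check done after each request;
        # the real method inspects the page, this one only counts calls.
        self.user_data_checks += 1

    def getUrl(self, path, noLogin=False):
        """Fetch 'path'; skip the user-data check when noLogin is True."""
        text = self._fetch(path)
        if not noLogin:
            self._getUserDataOld(text)
        return text
```

With the default noLogin=False, existing callers behave exactly as
before; only callers that explicitly pass noLogin=True (for example
when fetching a non-wiki URL) skip the check.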
Greetings and have a nice day
DrTrigon
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla -
http://enigmail.mozdev.org/
iEYEARECAAYFAk8lrQcACgkQAXWvBxzBrDCZlQCfaJkYuTjkDkcmDok2cGslmNH2
W5QAoJini5QSuytsMWmzmahAswnbJPkR
=sQ+7
-----END PGP SIGNATURE-----