Revision: 5471
Author: filnik
Date: 2008-05-30 11:32:38 +0000 (Fri, 30 May 2008)
Log Message:
-----------
Adding a new parameter (no_hostname) to getUrl()
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-05-30 10:31:38 UTC (rev 5470)
+++ trunk/pywikipedia/wikipedia.py 2008-05-30 11:32:38 UTC (rev 5471)
@@ -4215,16 +4215,17 @@
         return response, data
-    def getUrl(self, path, retry = True, sysop = False, data = None, compress = True):
+    def getUrl(self, path, retry = True, sysop = False, data = None, compress = True, no_hostname = False):
         """
         Low-level routine to get a URL from the wiki.
         Parameters:
-            path  - The absolute path, without the hostname.
-            retry - If True, retries loading the page when a network error
-                    occurs.
-            sysop - If True, the sysop account's cookie will be used.
-            data  - An optional dict providing extra post request parameters
+            path        - The absolute path, without the hostname.
+            retry       - If True, retries loading the page when a network error
+                          occurs.
+            sysop       - If True, the sysop account's cookie will be used.
+            data        - An optional dict providing extra post request parameters.
+            no_hostname - Open the URL given, don't add the hostname before.
         Returns the HTML text of the page converted to unicode.
         """
@@ -4260,8 +4261,10 @@
             uo.addheader('Cookie', self.cookies(sysop = sysop))
         if compress:
             uo.addheader('Accept-encoding', 'gzip')
-
-        url = '%s://%s%s' % (self.protocol(), self.hostname(), path)
+        if no_hostname == True: # This allow users to parse also toolserver's script
+            url = path          # and other useful pages without using some other functions.
+        else:
+            url = '%s://%s%s' % (self.protocol(), self.hostname(), path)
         data = self.urlEncode(data)
         # Try to retrieve the page until it was successfully loaded (just in
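
The URL selection introduced by the second hunk can be tried outside the Site class. This is only an illustrative sketch, not the pywikipedia code: build_url and its protocol/hostname arguments are hypothetical stand-ins for self.protocol() and self.hostname(), and the example.org address is a placeholder.

```python
def build_url(path, protocol="http", hostname="en.wikipedia.org", no_hostname=False):
    """Return the URL to open (hypothetical helper mirroring the r5471 branch)."""
    if no_hostname:
        # The caller passed a complete URL (e.g. an external script), use it as-is.
        return path
    # Default behaviour: prepend protocol and hostname to the absolute path.
    return "%s://%s%s" % (protocol, hostname, path)


wiki_url = build_url("/w/index.php?title=Main_Page")
external_url = build_url("http://example.org/script.php", no_hostname=True)
```

With no_hostname left at its default, existing callers keep getting the wiki-relative behaviour, which is presumably why the parameter was added at the end of the signature.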
Bugs item #1977728, was opened at 2008-05-29 15:20
Message generated for change (Comment added) made by rotemliss
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1977728&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
>Status: Closed
>Resolution: Out of Date
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Interwiki sorting
Initial Comment:
There is some bug in interwiki sorting:
fi: is now sometimes added among the languages starting with S
http://en.wikinews.org/w/index.php?title=Category:February_1%2C_2008&diff=p…
and sk: is added at the end :-(
http://en.wikipedia.org/w/index.php?title=Category:Slovak_sportspeople&diff…
----------------------------------------------------------------------
>Comment By: Rotem Liss (rotemliss)
Date: 2008-05-30 09:20
Message:
Logged In: YES
user_id=1327030
Originator: NO
The "fi:" behavior is not a bug but intended: see
http://meta.wikimedia.org/wiki/Interwiki_sorting_order . The "sk:" issue
seems to be fixed (I couldn't reproduce it when checking on wikipedia:en).
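
Why "fi:" can end up among the S languages is easy to see when links are ordered by the language's own name (Suomi) instead of by code. A rough sketch, assuming a small hand-written table of native names; the real order is a maintained list on Meta, and NATIVE_NAME and both helper functions here are hypothetical:

```python
# Illustrative subset of native language names, lowercase for comparison.
NATIVE_NAME = {
    "fi": "suomi",
    "sk": "slovenčina",
    "sl": "slovenščina",
    "sv": "svenska",
}


def sort_by_code(codes):
    """Plain alphabetical order of the language codes."""
    return sorted(codes)


def sort_by_local_name(codes):
    """Order codes by native language name; unknown codes sort by the code itself."""
    return sorted(codes, key=lambda code: NATIVE_NAME.get(code, code))


codes = ["sv", "fi", "sl", "sk"]
by_code = sort_by_code(codes)        # fi first, since "f" < "s"
by_name = sort_by_local_name(codes)  # fi lands between sl and sv
```

The comparison uses plain code-point order, which is only an approximation of a proper alphabetical collation.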
----------------------------------------------------------------------
Comment By: Melancholie (melancholie)
Date: 2008-05-30 01:08
Message:
Logged In: YES
user_id=2089773
Originator: NO
Note:
The sk: issue has already been fixed:
http://sourceforge.net/tracker/index.php?func=detail&aid=1977131&group_id=9…
Not sure whether the *fi:* issue is still there!?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-05-29 22:47
Message:
Logged In: NO
Additionally, I request that pywiki check the template's doc page, because
interwikis and categories are often placed there (in an includeonly tag), and
pywiki then puts the interwikis into the template again (inside or outside the
noinclude tag). I think pywikipedia is unusable for templates now.
----------------------------------------------------------------------
Bugs item #1973804, was opened at 2008-05-27 02:53
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1973804&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: General
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 8
Private: No
Submitted By: Melancholie (melancholie)
>Assigned to: NicDumZ — Nicolas Dumazet (nicdumz)
Summary: Huge memory consumption during changing process
Initial Comment:
As soon as the changing process (putting/saving of pages) starts, interwiki.py (r5440) consumes more than 100 MB of memory (RAM+Swap) if the bot is working on many wikis. Memory usage keeps growing during the changing process. When the changing process has finished, the memory is suddenly freed. Memory usage is normal again then, but only until the next 'putting-pages process' begins ;-)
----------------------------------------------------------------------
>Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2008-05-30 07:57
Message:
Logged In: YES
user_id=1963242
Originator: NO
Thanks to russblau and Bryan, this drawback should be greatly reduced by
now, provided you run Python 2.5 (I add this because the caveats of previous
versions are well known).
russblau made sure that every Site object is cached, which avoids recreating a
new Site object every time a new page is found. This should help a lot with
our current interwiki issue.
Bryan introduced the diskcache feature, which saves MediaWiki messages on disk
to try to reduce RAM usage (set use_diskcache = True in user-config.py if
you need it).
I'm closing this bug since the overhead on put/save of pages for
interwiki.py is fixed. :)
----------------------------------------------------------------------
Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2008-05-29 13:24
Message:
Logged In: YES
user_id=1963242
Originator: NO
Okay, this has been partially fixed by r5461.
However, the fact that it is slow at _EACH_ put means that MediaWiki
messages are retrieved at _EACH_ put. And since a Site object never
retrieves its messages more than once, that might mean that the
creation of Site objects in interwiki.py is suboptimal.
A nice thing to check would be: are we sure that only a single Site
object is created per site in an interwiki.py run?
----------------------------------------------------------------------
Comment By: Melancholie (melancholie)
Date: 2008-05-29 09:09
Message:
Logged In: YES
user_id=2089773
Originator: YES
This bug is definitely because of that change:
http://svn.wikimedia.org/viewvc/pywikipedia/trunk/pywikipedia/wikipedia.py?…
----------------------------------------------------------------------
Comment By: Melancholie (melancholie)
Date: 2008-05-28 07:21
Message:
Logged In: YES
user_id=2089773
Originator: YES
On low-memory systems this even leads to:
Inconsistency detected by ld.so: dl-minimal.c: 84: __libc_memalign:
Assertion `page != ((void *) -1)' failed!
Does that have to do with BeautifulSoup.py?
The revision that used (c)ElementTree did not cause that kind of bug!
----------------------------------------------------------------------
Bugs item #1978787, was opened at 2008-05-30 06:43
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1978787&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: General
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Melancholie (melancholie)
>Assigned to: NicDumZ — Nicolas Dumazet (nicdumz)
Summary: diskcache.py sometimes requests missing attribute 'SEEK_CUR'
Initial Comment:
At least using PCLinuxOS and SuSE; Python 2.5.2:
Updating links on page [[tr:Kuban Irmağı]].
Changes to be made: Ekleniyor: [[eo:Kuban]]
+ [[eo:Kuban]]
NOTE: Updating live wiki...
Changing page [[tr:Kuban Irmağı]]
Dump eo (wikipedia) saved
Traceback (most recent call last):
  File "interwiki.py", line 1717, in ?
    bot.run()
  File "interwiki.py", line 1468, in run
    self.queryStep()
  File "interwiki.py", line 1447, in queryStep
    subj.finish(self)
  File "interwiki.py", line 1035, in finish
    if self.replaceLinks(page, new, bot):
  File "interwiki.py", line 1186, in replaceLinks
    status, reason, data = page.put(newtext, comment = wikipedia.translate(page.site().lang, msg)[0] + mods)
  File "/wikipedia.py", line 1270, in put
    newPage, self.site().getToken(sysop = sysop), sysop = sysop)
  File "/wikipedia.py", line 1370, in _putPage
    if self.site().has_mediawiki_message("spamprotectiontitle")\
  File "/wikipedia.py", line 4528, in has_mediawiki_message
    v = self.mediawiki_message(key)
  File "/wikipedia.py", line 4520, in mediawiki_message
    if self._mediawiki_messages[key] is None:
  File "/diskcache.py", line 99, in __getitem__
    self.cache_file.seek(length, os.SEEK_CUR)
AttributeError: 'module' object has no attribute 'SEEK_CUR'
----------------------------------------------------------------------
>Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2008-05-30 07:50
Message:
Logged In: YES
user_id=1963242
Originator: NO
That seems to depend on the OS. I could not reproduce that behavior on my
Debian system.
However, I found a python-list thread about this behavior,
http://mail.python.org/pipermail/python-list/2006-March/375280.html , and
added the fix in r5469.
Thanks for your reports, melancholie, they do help :)
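
The content of r5469 is not shown in this thread. One common workaround for a missing os.SEEK_CUR, given here only as an assumption about the kind of fix involved, is to fall back to the numeric whence value 1, which is what the constant stands for. skip_bytes is a hypothetical helper:

```python
import io
import os

# os.SEEK_CUR is the integer 1; use the literal when the attribute is missing.
SEEK_CUR = getattr(os, "SEEK_CUR", 1)


def skip_bytes(stream, length):
    """Move `length` bytes forward from the current position and return the new offset."""
    stream.seek(length, SEEK_CUR)
    return stream.tell()


buf = io.BytesIO(b"0123456789")
buf.read(2)                    # position is now 2
position = skip_bytes(buf, 3)  # relative seek moves it to 5
```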
----------------------------------------------------------------------
Revision: 5468
Author: nicdumz
Date: 2008-05-30 05:37:12 +0000 (Fri, 30 May 2008)
Log Message:
-----------
* Raising KeyError when some stupid users try to work with key '' (I did it :p )
* Doing only one disk access per mediawiki_message call; Catching diskusage KeyError on key access to raise a nice KeyError("MediaWiki key '%s' does not exist[...]) instead
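
The second item of this log message (one lookup per call, with the low-level KeyError turned into a descriptive one) can be sketched as follows. MessageStore and CountingDict are hypothetical; CountingDict only stands in for a disk-backed mapping so that the number of accesses can be observed:

```python
class CountingDict(dict):
    """Dict that counts item reads, standing in for a disk-backed cache."""
    reads = 0

    def __getitem__(self, key):
        self.reads += 1
        return dict.__getitem__(self, key)


class MessageStore(object):
    """Hypothetical stand-in for a Site holding its MediaWiki messages."""

    def __init__(self, name, messages):
        self.name = name
        self._mediawiki_messages = messages

    def mediawiki_message(self, key):
        key = key.lower()
        try:
            # Single access: the value is fetched once and returned directly.
            return self._mediawiki_messages[key]
        except KeyError:
            raise KeyError("MediaWiki key '%s' does not exist on %s"
                           % (key, self.name))


cache = CountingDict(spamprotectiontitle="Spam protection filter")
store = MessageStore("wikipedia:eo", cache)
value = store.mediawiki_message("SpamProtectionTitle")
```

The earlier "is None" check followed by a second subscript read the mapping twice per call, which matters when every read goes to disk.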
Modified Paths:
--------------
trunk/pywikipedia/diskcache.py
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/diskcache.py
===================================================================
--- trunk/pywikipedia/diskcache.py 2008-05-29 23:20:27 UTC (rev 5467)
+++ trunk/pywikipedia/diskcache.py 2008-05-30 05:37:12 UTC (rev 5468)
@@ -62,7 +62,10 @@
         if type(key) is unicode:
             key = key.encode('utf-8')
-        index = key[0]
+        try:
+            index = key[0]
+        except IndexError:
+            raise KeyError(key)
         if not ((index >= 'a' and index <= 'z') or (index >= '0' and index <= '9')):
             raise KeyError(key)
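
The effect of this hunk is that an empty key now fails the same way as any other unusable key. A standalone sketch of the check; index_char is a hypothetical name, and the real method goes on to use the index character for its cache lookup:

```python
def index_char(key):
    """Return the first character of `key`, or raise KeyError if it is unusable."""
    try:
        index = key[0]
    except IndexError:
        # An empty key has no first character; treat it as a missing key.
        raise KeyError(key)
    # Only lowercase ASCII letters and digits are accepted as index characters.
    if not ("a" <= index <= "z" or "0" <= index <= "9"):
        raise KeyError(key)
    return index


first_char = index_char("spamprotectiontitle")
```

Callers then only need to handle KeyError, instead of also guarding against IndexError for the empty string.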
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-05-29 23:20:27 UTC (rev 5467)
+++ trunk/pywikipedia/wikipedia.py 2008-05-30 05:37:12 UTC (rev 5468)
@@ -4517,10 +4517,12 @@
                 break
         key = key.lower()
-        if self._mediawiki_messages[key] is None:
+        try:
+            value = self._mediawiki_messages[key]
+            return value
+        except KeyError:
             raise KeyError("MediaWiki key '%s' does not exist on %s"
                            % (key, self))
-        return self._mediawiki_messages[key]
     def has_mediawiki_message(self, key):
         """Return True iff this site defines a MediaWiki message for 'key'."""
Bugs item #1978787, was opened at 2008-05-30 06:43
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1978787&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Melancholie (melancholie)
Assigned to: Nobody/Anonymous (nobody)
Summary: diskcache.py sometimes requests missing attribute 'SEEK_CUR'
Initial Comment:
At least using PCLinuxOS and SuSE; Python 2.5.2:
Updating links on page [[tr:Kuban Irmağı]].
Changes to be made: Ekleniyor: [[eo:Kuban]]
+ [[eo:Kuban]]
NOTE: Updating live wiki...
Changing page [[tr:Kuban Irmağı]]
Dump eo (wikipedia) saved
Traceback (most recent call last):
  File "interwiki.py", line 1717, in ?
    bot.run()
  File "interwiki.py", line 1468, in run
    self.queryStep()
  File "interwiki.py", line 1447, in queryStep
    subj.finish(self)
  File "interwiki.py", line 1035, in finish
    if self.replaceLinks(page, new, bot):
  File "interwiki.py", line 1186, in replaceLinks
    status, reason, data = page.put(newtext, comment = wikipedia.translate(page.site().lang, msg)[0] + mods)
  File "/wikipedia.py", line 1270, in put
    newPage, self.site().getToken(sysop = sysop), sysop = sysop)
  File "/wikipedia.py", line 1370, in _putPage
    if self.site().has_mediawiki_message("spamprotectiontitle")\
  File "/wikipedia.py", line 4528, in has_mediawiki_message
    v = self.mediawiki_message(key)
  File "/wikipedia.py", line 4520, in mediawiki_message
    if self._mediawiki_messages[key] is None:
  File "/diskcache.py", line 99, in __getitem__
    self.cache_file.seek(length, os.SEEK_CUR)
AttributeError: 'module' object has no attribute 'SEEK_CUR'
----------------------------------------------------------------------
Bugs item #1978444, was opened at 2008-05-30 00:13
Message generated for change (Comment added) made by misza13
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1978444&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Melancholie (melancholie)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add Ślůnski interwiki.py message
Initial Comment:
Please add the line
'szl': (u'Bot ', u'dodowo', u'wyćepuje', u'zmjyńo'),
to the 'msg' array in interwiki.py.
See:
http://szl.wikipedia.org/wiki/Wikipedyjo:Boty#Interwiki_bots_-_changes_desc…
----------------------------------------------------------------------
>Comment By: Misza13 (misza13)
Date: 2008-05-30 01:21
Message:
Logged In: YES
user_id=1686644
Originator: NO
Added in r5467.
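
For context, each entry of the 'msg' table is a 4-tuple, and the traceback in item #1978787 shows element [0] being used as the edit-summary prefix. A rough sketch of how such an entry might be turned into a summary; the 'en' tuple is a placeholder, and describe_additions with its plain fallback to 'en' is a hypothetical simplification of wikipedia.translate():

```python
# -*- coding: utf-8 -*-
msg = {
    # Placeholder English entry, assumed for illustration only.
    'en': (u'robot ', u'Adding', u'Removing', u'Modifying'),
    # Entry requested in this tracker item.
    'szl': (u'Bot ', u'dodowo', u'wyćepuje', u'zmjyńo'),
}


def describe_additions(lang, links):
    """Build a summary such as 'Bot dodowo: [[eo:Kuban]]' for added links."""
    # Hypothetical fallback: use the English tuple when `lang` has no entry.
    prefix, adding, removing, modifying = msg.get(lang, msg['en'])
    return u"%s%s: %s" % (prefix, adding, u", ".join(links))


summary = describe_additions('szl', [u'[[eo:Kuban]]'])
```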
----------------------------------------------------------------------