Hi Bináris,
I did not write any of the threaded stuff in wikipedia.py, but I
have used it a couple of times. I think what you should do is provide
a callable _object_ rather than a plain callback function. You can then
iterate over the list of callback objects and inspect the errors, if
there are any. Here is a sample program I wrote to illustrate the concept:
import wikipedia as pywikibot
from time import sleep

pages = [
    'User:HRoestBot/CallbackTest1',
    'User:HRoestBot/CallbackTest2',
]

class CallbackObject(object):
    """Remembers the page and error that put_async reports back."""

    def __init__(self):
        self.done = False

    def __call__(self, page, error):
        self.page = page
        self.error = error
        self.done = True

callbacks = []
for mypage in pages:
    print mypage
    callb = CallbackObject()
    page = pywikibot.Page(pywikibot.getSite(), mypage)
    callbacks.append(callb)
    page.put_async('some text', callback=callb)

# Wait until all pages have been saved on Wikipedia.
while not all(c.done for c in callbacks):
    print "Still Waiting"
    sleep(5)

# Now we can look at the errors.
for obj in callbacks:
    print obj.page, obj.error
    if obj.error is not None:
        # do something to handle the error
        pass
The output of such a program may then be:
$ python test.py
unicode test: triggers problem #3081100
HRoestBot/CallbackTest1
HRoestBot/CallbackTest2
Sleeping for 4.0 seconds, 2012-02-24 09:32:57
Still Waiting
Still Waiting
Updating page [[HRoestBot/CallbackTest1]] via API
Still Waiting
Sleeping for 19.3 seconds, 2012-02-24 09:33:18
Still Waiting
Updating page [[HRoestBot/CallbackTest2]] via API
Still Waiting
[[de:HRoestBot/CallbackTest1]] An edit conflict has occurred.
[[de:HRoestBot/CallbackTest2]] An edit conflict has occurred.
hr@hr:~/projects/private/pywikipedia_gitsvn$
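As an aside, the sleep-polling loop in the sample can be avoided by having the callback set a threading.Event, so the waiter blocks until the save is reported. This is only a sketch of the idea using the standard library; the CallbackObject variant here is my own, not pywikibot API, and the worker thread merely simulates an asynchronous save:

```python
import threading

class EventCallback(object):
    """Hypothetical variant: like CallbackObject, but signals an Event."""

    def __init__(self):
        self.event = threading.Event()
        self.page = None
        self.error = None

    def __call__(self, page, error):
        self.page = page
        self.error = error
        self.event.set()  # wake up anyone blocked in wait()

# Simulate an asynchronous save invoking the callback from a worker thread.
cb = EventCallback()
worker = threading.Thread(target=cb, args=('SomePage', None))
worker.start()

cb.event.wait()   # blocks until the callback has fired; no sleep loop
worker.join()
```

With several pages you would simply wait() on each callback's event in turn instead of polling the whole list every few seconds.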
At least, that is how I do it; I hope that helps you understand it. You
can also use pywikibot.page_put_queue.qsize() and
pywikibot.page_put_queue.empty() to check whether the queue is empty,
but this might still lead to problems: the page is
fetched from the queue and *then* page.put is called on it, so until
page.put() finishes, the queue will be empty even though the bot is
still putting the page. See the function async_put(); it seems to
me much safer to rely on the callback objects to be sure that all the
put calls are done.
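That race can be sketched with plain stdlib code (no pywikibot involved; the worker below is just a stand-in for async_put): the item leaves the queue first, and only then is the slow "put" performed, so the queue reports empty while work is still in flight.

```python
import queue
import threading
import time

q = queue.Queue()
done = []

def worker():
    while True:
        item = q.get()       # the item leaves the queue here...
        if item is None:
            break
        time.sleep(0.2)      # ...but the slow "put" is still running here
        done.append(item)

t = threading.Thread(target=worker)
t.start()
q.put('page1')
time.sleep(0.05)             # give the worker time to dequeue the item

# The race window: queue is empty, yet the page is not saved.
empty_but_unfinished = q.empty() and not done

q.put(None)                  # shut the worker down
t.join()
```

Here `empty_but_unfinished` comes out True: exactly the situation where a bot relying on `page_put_queue.empty()` would exit too early.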
You can also look at the _flush() method in wikipedia.py to see how it
determines whether all pages have been put and it is safe to exit.
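For what it's worth, the end-of-queue idea that _flush() and the "explicit end-of-Queue marker" line in the quoted mail refer to looks roughly like this in plain stdlib terms (a sketch, not the actual wikipedia.py code): push a None marker so the worker thread exits its loop, then join() the thread, which guarantees every previously queued item was processed.

```python
import queue
import threading

q = queue.Queue()
results = []

def async_worker():
    while True:
        page = q.get()
        if page is None:          # explicit end-of-queue marker
            break
        results.append('saved:' + page)

t = threading.Thread(target=async_worker)
t.start()

q.put('PageA')
q.put('PageB')
q.put(None)   # tell the worker that no more pages are coming
t.join()      # returns only after every queued page was handled
```

Because the None marker sits behind all real pages in the FIFO queue, joining the thread after sending it is a safe way to know that every put is finished.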
Hannes
On 23 February 2012 21:30, Bináris <wikiposta(a)gmail.com> wrote:
I made a big effort to understand this stuff with
put_async and threading,
but here is a point I can't get over.
I read a lot and understood that the way to wait for a thread is join(),
and that in wikipedia.py join must be a method of _putthread, which is the
Thread object. Now, wherever I write the line
_putthread.join()
(I tried put_async, async_put and even replace.py, which I know is not a good
solution) it freezes my command window as if the thread never terminated.
_putthread.join(time) waits for the given time, but that is not appropriate,
only good for testing.
Does any script really use this callback at all? Which one?
Line 8054 in wikipedia.py says "an explicit end-of-Queue marker is needed",
and this is supposed to be a call for async_put with None as value of page.
But I don't see this dummy call to async_put anywhere. Might that be a bug,
or do I just misunderstand?
(Btw, Python 2.4 should be forgotten.)
Please, I need your help.
--
Bináris
_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l