https://bugzilla.wikimedia.org/show_bug.cgi?id=55889
--- Comment #4 from Strainu crangasi2001@yahoo.com --- (In reply to comment #3)
> The next layer, comms.threadedhttp, supports asynchronous requests. [...] I don't think we use this feature anywhere, as it's not exposed in the higher-up layers.
I noticed that while writing the answer to Gerard's questions today :)
> For saving pages, which (I think) is the most relevant place for async request, we already have support, where requests that do not return a reply that has to be handled can be handled asynchronously - see Page.put_async.
I've experimented with put_async, with mixed results. When the upload works, it's mostly OK. However, when a request hits an error (like a 504 from the server), it just keeps trying again and again, keeping the thread blocked.
Instead, the request should probably be de-queued, processed and, if a callback has been registered, the callback should be called in order to allow the bot to re-queue the request. This, however, could cause trouble if the order of the requests is important. The bot can receive a callback, but AFAIK it cannot remove already queued requests. Also, what happens if no callback has been registered? Should we simply re-queue the request? I don't have a perfect solution at this time, but this is a point that should be considered.
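To make the idea above a bit more concrete, here is a rough sketch of one way it could look. It is not the actual put_async code: SaveQueue, max_retries and the callback signature are all names I made up for illustration. A failed request is taken off the queue; if a callback was registered, the callback decides whether to re-queue, and otherwise the request is re-queued a bounded number of times and then dropped. Note that it does nothing about the ordering problem, since a re-queued request goes to the back of the queue.

```python
import queue
import threading


class SaveQueue:
    """Hypothetical stand-in for the queue behind Page.put_async."""

    def __init__(self, max_retries=3):
        # Assumption: without a callback, give up after max_retries attempts.
        self.max_retries = max_retries
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def put(self, request, callback=None, attempt=1):
        """Queue a zero-argument callable that performs the save."""
        self._queue.put((request, callback, attempt))

    def join(self):
        """Block until every queued request has been handled."""
        self._queue.join()

    def _run(self):
        while True:
            request, callback, attempt = self._queue.get()
            try:
                error = None
                try:
                    request()
                except Exception as exc:  # e.g. a 504 from the server
                    error = exc
                if callback is not None:
                    # The bot decides what to do, including calling
                    # save_queue.put(...) again to re-queue the request.
                    # A callback that raises would kill this worker thread.
                    callback(self, request, error, attempt)
                elif error is not None and attempt < self.max_retries:
                    # No callback registered: re-queue at the back.
                    self.put(request, None, attempt + 1)
            finally:
                self._queue.task_done()
```

The point is only that the worker moves on to the next request instead of blocking on the failing one.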
Another possible issue, one that PWB can't really do much about, is that one can get a 504 even if the save succeeded, which makes re-queueing useless. I don't have a good solution for that either, but we could consult with the Wikimedia developers.
> For pagegenerators, we might be able to win a bit by requesting the (i+1)th page before returning the i-th page (or, for the PreloadingGenerator, by requesting the (i+1)th batch before all pages from the i-th batch have been returned).
This should be especially useful if it can be controlled by the user. Do you have any ideas on how to do this?
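One rough idea, purely as a sketch and not based on the existing pagegenerators code: wrap any generator so that the next load is already in flight while the current item is handed back, with the look-ahead depth exposed as a parameter. The names prefetching, load and lookahead are invented here; load stands for whatever call actually fetches a page (or a batch).

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def prefetching(pages, load, lookahead=1):
    """Yield load(page) for each page, in order.

    Up to `lookahead` extra loads are started before the current result
    is returned. `load` is a hypothetical callable that fetches one page
    (or one batch); `lookahead` is the user-controlled knob.
    """
    with ThreadPoolExecutor(max_workers=max(1, lookahead)) as pool:
        pending = deque()
        for page in pages:
            # Start loading this page in the background.
            pending.append(pool.submit(load, page))
            # Once more than `lookahead` loads are in flight, hand back
            # the oldest one; the newer ones keep loading meanwhile.
            if len(pending) > lookahead:
                yield pending.popleft().result()
        # Source exhausted: drain whatever is still in flight.
        while pending:
            yield pending.popleft().result()
```

With lookahead=1 the (i+1)th request is submitted before the i-th result is yielded; with lookahead=0 each load finishes before the next one starts, which is roughly the current behaviour.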
I think there were some good ideas brought up on this bug. Should we start a thread on the mailing list so we can gather more input on this?