All the documentation I could find is in docs/deferred.txt. Let me paste the paragraph:
"A few of the database updates required by various functions here can be deferred until after the result page is displayed to the user. For example, updating the view counts, updating the linked-to tables after a save, etc. PHP does not yet have any way to tell the server to actually return and disconnect while still running these updates (as a Java servelet could), but it might have such a feature in the future."
That text has been there at least since 2005. To my knowledge there is still no such feature. I've spent hours investigating why DeferrableUpdates delayed the page delivery, as I incorrectly assumed they would be run after the page has been delivered, and trying to figure out whether it is possible to make them actually work that way with PHP-FPM and nginx.
Should we just get rid of them? That should be easy, by either moving stuff to the jobqueue or just executing the code immediately.
Or if they are useful for something, can we at least document the *class* to reflect how it actually works and what it is useful for?
-Niklas
Yeah this is some ancient stuff... If it's actually ok to defer them we should be using the job queue.
And the job queue should.... be completely redone so it's not awful, if we haven't started on that already. :)
-- brion
On Thu, Sep 12, 2013 at 8:51 AM, Niklas Laxström niklas.laxstrom@gmail.com wrote:
> Should we just get rid of them? That should be easy, by either moving stuff to the jobqueue or just executing the code immediately.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Yeah DeferredUpdates pretty much suck. I was only using them because SearchUpdate already did. I should just bite the bullet and redo those as jobs.
Luckily they're not too widely used outside of core so we should be able to do this without a ton of pain.
-Chad
On Thu, Sep 12, 2013 at 8:53 AM, Brion Vibber bvibber@wikimedia.org wrote:
> Yeah this is some ancient stuff... If it's actually ok to defer them we should be using the job queue.
On 12/09/13 08:53, Brion Vibber wrote:
> And the job queue should.... be completely redone so it's not awful, if we haven't started on that already. :)
I would love for us to be able to rely on a job scheduling system such as Gearman (http://gearman.org/); there are most probably others around.
The Jenkins/Gerrit gateway (Zuul) has been enhanced upstream to use Gearman instead of talking to the Jenkins API. Zuul thus sends job requests to Gearman and forgets about them until it is given the result. You scale out your jobs by adding more workers to the Gearman bus.
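The submit-and-forget pattern described above can be sketched in a few lines of Python. This is not the Gearman API: the standard library's `concurrent.futures` is used as a stand-in, and `run_build`, `submit_all` and the job names are made up for illustration. The client hands jobs to a pool, carries on, and only looks at the results later; `max_workers` plays the role of adding workers to the bus.

```python
from concurrent.futures import ThreadPoolExecutor


def run_build(job_name):
    # Stand-in for the real work a worker would do (hypothetical).
    return f"{job_name}: SUCCESS"


def submit_all(job_names, workers):
    # Hand every job to the pool first, then collect the results
    # in submission order once they are available.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_build, name) for name in job_names]
        return [future.result() for future in futures]
```

Calling `submit_all(["lint", "unit"], workers=2)` gives the same results as `workers=1`; only the amount of parallelism changes.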
Speaking of the job queue, deferred updates are useful for adding jobs that depend on data that was not yet committed. This can easily be an issue since we normally wrap web requests in one DB transaction and commit at the very end. If you push() some jobs before the commit, and they get run before commit (which might randomly happen from time to time), and they depend on some of those DB changes, then the jobs might break. Using deferred updates works around this, as do the transaction callback methods in the Database classes (if you know exactly what DBs things depend on).
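To make the race concrete, here is a minimal Python sketch. It is not MediaWiki code, and every name in it (`Request`, `push_job`, `add_deferred_update`, `finish`) is hypothetical. It models a request whose DB writes only become visible at commit, a job that gets run "too early", and the same push wrapped in a deferred update. It assumes deferred updates run when the request is nearly done, after the commit.

```python
class Request:
    """Toy model of a web request wrapped in one DB transaction (hypothetical)."""

    def __init__(self):
        self.committed = {}    # rows visible to other processes, e.g. job runners
        self.pending = {}      # rows written inside the still-open transaction
        self.deferred = []     # callbacks to run when the request is nearly done
        self.job_results = []  # what each job saw when it ran

    def write(self, key, value):
        # Written inside the open transaction, so not visible to jobs yet.
        self.pending[key] = value

    def push_job(self, key):
        # Worst case: a job runner picks the job up immediately.
        # The job can only read committed data.
        self.job_results.append(self.committed.get(key))

    def add_deferred_update(self, callback):
        self.deferred.append(callback)

    def finish(self):
        # Assumed ordering: commit first, then run the deferred updates.
        self.committed.update(self.pending)
        self.pending.clear()
        for callback in self.deferred:
            callback()
        self.deferred.clear()


def run_demo():
    req = Request()
    req.write("page:1", "new text")
    req.push_job("page:1")                                   # pushed before commit
    req.add_deferred_update(lambda: req.push_job("page:1"))  # pushed after commit
    req.finish()
    return req.job_results
```

In this model `run_demo()` returns `[None, "new text"]`: the job pushed directly misses the uncommitted row, while the one pushed from the deferred update sees it.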
-- View this message in context: http://wikimedia.7.x6.nabble.com/What-are-DeferredUpdates-good-for-tp5013179... Sent from the Wikipedia Developers mailing list archive at Nabble.com.
On Mon, Sep 16, 2013 at 11:32 AM, Aaron Schulz aschulz4587@gmail.com wrote:
> If you push() some jobs before the commit, and they get run before commit (which might randomly happen from time to time), and they depend on some of those DB changes, then the jobs might break.
Could be worked around if we added a "don't do this job until" functionality to the queue.
-Chad
Until what? A timestamp? That would be more complex and prone to over/under guessing the right delay (you don't know how long it will take to commit). I think deferred updates are much simpler as they will just happen when the request is nearly done, however long that takes.