In an effort to reduce surprises and potential mishaps, it is now required to include any long-running tasks in the deployment calendar[0].
"Long-running tasks" include any script run on production 'work machines' such as terbium that lasts longer than ~1 hour. Think: migration and maintenance scripts.
This was discussed and proposed in T144661[1].
Best,
Greg
[0] https://wikitech.wikimedia.org/wiki/Deployments
    Relevant diff: https://wikitech.wikimedia.org/w/index.php?diff=850923&oldid=850244
[1] https://phabricator.wikimedia.org/T144661
<quote name="Greg Grossmeier" date="2016-09-20" time="15:29:30 -0700">
In an effort to reduce surprises and potential mishaps it is now required to include any long running tasks in the deployment calendar[0].
To clarify: this does *not* mean that no other deploys can happen at the same time. Other deploys *can* happen (as they did before), but this gives those running long-running tasks a way to communicate with deployers, and vice versa.
Deployers and those running a long-running task are encouraged to use the "Changes" section of the deployment calendar to communicate any gotchas or other information valuable to other deployers/opsen.
Of course, if a long-running task does imply no other deploys (not terribly common), then that would be coordinated appropriately.
Greg
Hello!
Increasing visibility sounds like a great idea! How far do we want to go in that direction? In particular, I'm thinking of a few of the crons we have for Cirrus. For example, we do have daily crons on terbium that re-generate the suggester indices. Those can run for > 1h.
My understanding is that those kinds of crons should not be considered scripts, but standard working parts of the system. Adding them would probably generate more noise than useful information. Is this a reasonable understanding?
Thanks!
Guillaume
--
Greg Grossmeier, Release Team Manager
GPG: B2FA 27B1 F7EB D327 6B8E A18D 1138 8E47 FAC8 1C7D
Engineering mailing list Engineering@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/engineering
wikitech-l@lists.wikimedia.org