Hi Marko,
I looked at the document and I would have multiple patches to it; alas, I don't have time to go through it with the needed attention right now because (as anticipated before the quarter) I have zero time to spend on this.
Ops have some experience in running a cluster orchestration system in production (for toollabs) and I have thought about it for quite some time now; I have some ideas on how things should be done to have a decent, manageable "elastic" environment with advantages for developers; I would love to integrate your document with ideas/a more general vision about production; this is probably not going to happen for at least one month though.
Can we hold on before we declare this document to be "definitive"?
Also, can we stop calling it a "container-based" infrastructure? :) I seriously think containers are little more than an implementation detail of the general vision.
Cheers,
Giuseppe
On Wed, Feb 15, 2017 at 11:28 PM, Marko Obrovac mobrovac@wikimedia.org wrote:
Hello,
In light of the upcoming annual planning for the joint technology goal of having a shared container-based infrastructure, the Services team has started collecting requirements for the platform in terms of development, testing and operation of services (together with some other considerations like automation and configuration management) [1]. Please take a look at the document and add/remove/improve/suggest as you see fit. Note that the document is to be considered only a draft at this point.
Cheers, Marko
[1] https://docs.google.com/a/wikimedia.org/document/d/1QsCVooqxkeE6tKYTxgoRvRdK2M3tDk4UyvmnHJrdag4/edit?usp=sharing
-- Marko Obrovac, PhD Senior Services Engineer Wikimedia Foundation
Ops mailing list Ops@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ops
Giuseppe Lavagetto, 16/02/2017 09:39:
Can we hold on before we declare this document to be "definitive"?
Publishing the draft on a wiki would also help.
[1] https://docs.google.com/a/wikimedia.org/document/d/1QsCVooqxkeE6tKYTxgoRvRdK2M3tDk4UyvmnHJrdag4/edit?usp=sharing
The link is private. Is this intended?
Nemo
Hello Giu,
On 16 February 2017 at 02:39, Giuseppe Lavagetto glavagetto@wikimedia.org wrote:
Hi Marko,
I looked at the document and I would have multiple patches to it; alas, I don't have time to go through it with the needed attention right now because (as anticipated before the quarter) I have zero time to spend on this.
Ops have some experience in running a cluster orchestration system in production (for toollabs) and I have thought about it for quite some time now; I have some ideas on how things should be done to have a decent, manageable "elastic" environment with advantages for developers; I would love to integrate your document with ideas/a more general vision about production; this is probably not going to happen for at least one month though.
This is an initial draft that I have shared so as to solicit wider input from Ops and RelEng and to keep everybody on the same page as we all look at the same thing from a different angle and have different ideas about it. As you might have noticed, the list is not really detailed and it's pretty high-level. You are making a good point regarding elasticity, and I think this is something we should discuss properly.
Can we hold on before we declare this document to be "definitive"?
One of our Q3 goals is to come up with a list of requirements for this brave new world of dynamicity, so we have some time to keep the discussion going.
Also, can we stop calling it a "container-based" infrastructure? :) I seriously think containers are little more than an implementation detail of the general vision.
"Fancy new stuff that will cure all of our problems" infra? I don't really have a preference on the name as long as we all agree on the vision for it.
Cheers, Marko
Cheers,
Giuseppe
On Wed, Feb 15, 2017 at 11:28 PM, Marko Obrovac mobrovac@wikimedia.org wrote:
Hello,
In light of the upcoming annual planning for the joint technology goal of having a shared container-based infrastructure, the Services team has started collecting requirements for the platform in terms of development, testing and operation of services (together with some other considerations like automation and configuration management) [1]. Please take a look at the document and add/remove/improve/suggest as you see fit. Note that the document is to be considered only a draft at this point.
Cheers, Marko
[1] https://docs.google.com/a/wikimedia.org/document/d/1QsCVooqxkeE6tKYTxgoRvRdK2M3tDk4UyvmnHJrdag4/edit?usp=sharing
-- Marko Obrovac, PhD Senior Services Engineer Wikimedia Foundation
-- Giuseppe Lavagetto, Ph.d. Senior Technical Operations Engineer, Wikimedia Foundation
Marko et al,
First off, thanks for working on all that and soliciting our input :)
On Fri, Feb 17, 2017 at 06:35:58PM -0600, Marko Obrovac wrote:
On 16 February 2017 at 02:39, Giuseppe Lavagetto glavagetto@wikimedia.org wrote:
Can we hold on before we declare this document to be "definitive"?
One of our Q3 goals is to come up with a list of requirements for this brave new world of dynamicity, so we have some time to keep the discussion going.
Unfortunately, as we explained pretty early during quarterly planning, we have no time to work on this not just now, but throughout this quarter (with Giuseppe working hard on the switchover quarterly goal, Alex on parental leave etc.). I don't anticipate finding the time to "keep the discussion going" much until your deadline and thus I don't believe we can end up with something definitive until then.
Whatever time we are spending on this, we are spending it for FY2017-2018's annual planning purposes, which I am afraid is a little more high-level than your document and is well underway.
Looking forward to working with you on this in the future :)
Best, Faidon
On 17-02-21 15:37:51, Faidon Liambotis wrote:
On Fri, Feb 17, 2017 at 06:35:58PM -0600, Marko Obrovac wrote:
On 16 February 2017 at 02:39, Giuseppe Lavagetto glavagetto@wikimedia.org wrote:
Can we hold on before we declare this document to be "definitive"?
One of our Q3 goals is to come up with a list of requirements for this brave new world of dynamicity, so we have some time to keep the discussion going.
Unfortunately, as we explained pretty early during quarterly planning, we have no time to work on this not just now, but throughout this quarter (with Giuseppe working hard on the switchover quarterly goal, Alex on parental leave etc.). I don't anticipate finding the time to "keep the discussion going" much until your deadline and thus I don't believe we can end up with something definitive until then.
I think services and releng this quarter are both grappling with the integration of our respective responsibilities into a deployment pipeline and modern infrastructure.
For releng, the incipient deployment pipeline began to take shape after an inventory of the shortcomings of our CI and beta clusters. The shortcomings we identified are, I believe, intractable without a coordinated effort from development to production. This is plainly not possible without input from many teams.
I understand that ops has no explicit time for this during the current quarter. As such, I appreciate the responsiveness to this discussion.
Speaking for releng, nothing will be *final* by the end of this quarter. We are refining the needs of development, CI, and deployment.
My (possibly erroneous) feeling is that ops has a better handle on their requirements for a modernized infrastructure than the releng team. This is why it is necessary for the requirements gathering and discussion to begin early for us, and why we made it an explicit goal for this quarter [0].
Looking forward to working with you on this in the future :)
Likewise :)
Thanks!
-- Tyler
[0]. https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201617Q3