[X-posting to ops as this discussion is relevant there too]
On Wed, Feb 17, 2016 at 5:53 PM, Erik Bernhardson ebernhardson@wikimedia.org wrote:
On Feb 17, 2016 1:50 AM, "Guillaume Lederrey" glederrey@wikimedia.org wrote:
Hello team!
== Versioning ==
** my belief ** anything deployed must have a version number
** what happens at WMF **
- deployments on labs are pretty much free-form: cherry-pick whatever you want on the puppetmaster
- deployments on prod seem to have version numbers, at least for mediawiki code; puppet code is deployed directly from the production branch (rough sketch of both flows below)
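To make the contrast concrete, here is a rough sketch of the two flows as I understand them; hostnames and paths are illustrative, not the real ones:

    # Labs / beta: free-form, cherry-pick unmerged changes directly on the project puppetmaster
    ssh deployment-puppetmaster.example.wmflabs     # illustrative hostname
    cd /var/lib/git/operations/puppet               # illustrative path
    git fetch origin
    git cherry-pick <sha-of-unmerged-change>        # pick whatever you want to test

    # Production: no cherry-picks; the puppetmaster only fast-forwards along the production branch
    cd /var/lib/git/operations/puppet
    git fetch origin
    git merge --ff-only origin/production           # only reviewed, merged changes land here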
** comments **
Having clear version numbers implies making a conscious decision to create a version, potentially with appropriate checks on the content of that version and additional testing. It allows a clear separation between creating a version and promoting it to production. Not having versions everywhere allows more flexibility and puts the responsibility for making the right choices more on the people than on the process. That is probably a good thing if you have smart enough people (and WMF seems to have a pretty smart crowd).
Having a shared git repository on deployment-puppetmaster scares the hell out of me! I'm so used to preparing anything I want to push locally and then just applying a specific tag / version...
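For comparison, the tag-based workflow described above looks roughly like this (version numbers and remotes are made up for illustration):

    # Locally: decide what goes into the release, then cut an explicit version
    git tag -a v1.42.0 -m "release 1.42.0"
    git push origin v1.42.0

    # On the target: deploy exactly that version, nothing more, nothing less
    git fetch --tags origin
    git checkout v1.42.0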
Puppet being unversioned certainly makes it different from the rest of our deployments. I think ops gets away with this by having relatively few people committing code. It also has to do with the careful nature of puppet deployments: puppet is typically deployed one patch at a time. I think this helps with understanding what just broke everything, compared to a big release with many disparate changes.
Puppet is _always_ deployed one patch at a time, except in very special cases, and I do think that's a very good thing for operations. There are a few reasons why:
1) Minimize change risk/surface: given we're a very high traffic website with a mildly complex architecture, you can't realistically expect to validate a large set of changes without throwing live traffic at them. I've seen ops teams working with stricter change management strategies, and the risk of *big troubles* has always been higher.
2) Speed of deployment: we're a very small team for the amount of things we're doing in parallel. We can't seriously expect to keep up the pace with stricter change management (as in, deploying a new version of our puppet code N times a week after rigorous testing and picking the changes that make the cut).
3) Keeping changes independent: since the puppet repo is large and includes all of production, having changes to independent systems tied together is a recipe for disaster: rolling back one change would mean rolling back all of them, frustrating a lot of people and probably requiring coordination with other teams (a sketch of reverting a single change independently follows below). You could just revert the affected change and make a new point release, but then I completely fail to see how having releases does us any good.
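To illustrate point 3: in the one-patch-at-a-time model, undoing a single bad change is just one revert, with no effect on unrelated changes. The hash and the Gerrit-style push target below are placeholders, not actual values:

    # Find the offending change on the production branch
    git log --oneline -5 origin/production
    # abc1234  role::cache: tweak backend timeouts   <- the bad one (placeholder)

    # Revert only that change; everything merged before or after it stays in place
    git revert abc1234
    git push origin HEAD:refs/for/production   # send the revert for review (Gerrit-style, illustrative)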
About cherry-picks in beta: the problem is not cherry-picking (I think it's a reasonable way to test things); persistent cherry-picking to monkey-patch problems is. I think if we follow the flow of:
- writing a patch
- testing it on beta with a cherry-pick
- getting it merged on ops/puppet and into production
and all of this happens within a week, that would be a decent compromise.
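A minimal sketch of that cycle, assuming the beta puppetmaster carries local cherry-picks on top of the upstream branch (hostnames and SHAs are placeholders):

    # 1) Write the patch and push it for review as usual

    # 2) Test it on beta by cherry-picking it onto the beta puppetmaster
    git fetch origin
    git cherry-pick <sha-of-your-change>
    sudo puppet agent --test      # then run puppet on an affected beta host and verify

    # 3) Once the patch is merged in ops/puppet, drop the local cherry-pick
    git fetch origin
    git rebase origin/production  # the now-merged pick becomes empty and is dropped,
                                  # so nothing lingers as a long-lived monkey patch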
- I still have not found a global architecture schema (something like a high-level component or deployment diagram). But then, I have never seen any company that actually has one...
Pretty sure one doesn't exist :(
Luca (the new analytics opsen) has started to work on https://wikitech.wikimedia.org/wiki/File:Infrastructure_overview.png
I asked him to share the sources for it so that everyone can improve it.
Also, if you need some oral history, just ask opsens and we'll be happy to give you an overview of how things work :)
Cheers,
Giuseppe