<div dir="ltr">On Tue, Feb 4, 2014 at 12:14 AM, Petr Bena <span dir="ltr">&lt;<a href="mailto:benapetr@gmail.com" target="_blank">benapetr@gmail.com</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think that Ryan said something like he would most happily get rid of<br>

puppet or replace it with a better solution :P but if you really want<br>

to keep stuff managed by puppet, I still see an issue with other<br>

projects which aren't using puppet, or which use a different<br>

puppetmaster.<br>

<br></blockquote><div><br></div><div>I didn't say that. I said if you're starting from scratch you should consider something other than puppet. That wasn't about Labs or Wikimedia at all.<br></div><div> </div>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

To be honest, from my point of view, puppet as it is now on labs is<br>

almost unusable for non-ops users. Getting any simple change merged,<br>

unless it's a top-priority thing, requires someone from ops and usually<br>

takes at least a few hours, if not days. I can't imagine any sysadmin who<br>

can work like this: some changes need to be applied immediately, and you<br>

can't wait days for them to happen. So I expect that the vast<br>

majority of projects that exist now will not use puppet anyway (you<br>

just can't force people to use it under these circumstances), so they<br>

wouldn't benefit from this.<br>

<br></blockquote><div><br></div><div>You shouldn't be making changes to systems without code review. Wikimedia Ops generally has a bad practice in this regard (self-merging). It's mostly historical. Other places I've worked at or consulted with *require* code review to merge.<br>


<br></div><div>So you know, I work like this (and I'm pretty reasonably productive, from most people's perspective).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


That is why I think that even if we are to use this puppet nrpe<br>

management there still should be a way for manual adjustments and not<br>

just because of these projects, but also to fix other icinga issues.<br>

For example, right now it receives some nonsense (broken) data from ldap<br>

about instances that don't even exist anymore. If there weren't that<br>

nasty workaround consisting of an instance ignore list, which prevents<br>

these hosts from being monitored, icinga would be full of hosts that<br>

are down. How would you apply an i_dont_exist puppet class to a nonexistent<br>

node? :P<br>

<br></blockquote><div><br></div><div>Did you put a bug in about the broken data?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I have nothing against "labs cloning production" except that IMHO it<br>

should be the other way around (production should actually clone labs, which<br>

is the testing env where changes should happen first before they get<br>

deployed on production), but still labs != production so I think we<br>

could have some extra thing here that would make it easier to manage<br>

icinga for regular, non-ops people which would exist on labs only and<br>

not on production.<br>

<div class="HOEnZb"><div class="h5"><br></div></div></blockquote><div><br></div><div>The biggest reason we can't do the same thing in labs and production for nagios is that in production nagios is generated via exported resources, which are disabled in labs.<br>


<br></div><div>As far as I know, that and ssh host keys are the only things in Wikimedia's puppet that require exported resources.<br><br></div><div>- Ryan<br></div></div></div></div>