<div dir="ltr">On Tue, Feb 4, 2014 at 12:14 AM, Petr Bena <span dir="ltr">&lt;<a href="mailto:benapetr@gmail.com" target="_blank">benapetr@gmail.com</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think that Ryan said something like he would most happily get rid of<br>
puppet or replace it with a better solution :P but if you really want<br>
to keep stuff managed by puppet, I still see an issue with other<br>
projects which aren't using puppet, or which use a different<br>
puppetmaster.<br>
<br></blockquote><div><br></div><div>I didn't say that. I said if you're starting from scratch you should consider something other than puppet. That wasn't about Labs or Wikimedia at all.<br></div><div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
To be honest, from my point of view, puppet as it is now on labs is<br>
almost unusable for non-ops users. Getting any simple change merged<br>
unless it's a top-priority thing requires someone from ops, and usually<br>
takes at least a few hours, if not days. I can't imagine any sysadmin who<br>
can work like this; some changes need to be applied immediately, you<br>
can't wait days for them to happen, so I expect that the vast<br>
majority of projects that exist now will not use puppet anyway (you<br>
just can't force people to use it under these circumstances), so they<br>
wouldn't benefit from this.<br>
<br></blockquote><div><br></div><div>You shouldn't be making changes to systems without code review. Wikimedia Ops generally has a bad practice in this regard (self-merging). It's mostly historical. Other places I've worked at or consulted with *require* code review to merge.<br>
<br></div><div>So you know, I work like this (and I'm reasonably productive, from most people's perspective).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
That is why I think that even if we are to use this puppet nrpe<br>
management there still should be a way for manual adjustments and not<br>
just because of these projects, but also to fix other icinga issues.<br>
For example, right now it receives some nonsense (broken) data from ldap<br>
about instances that don't even exist anymore. If there wasn't that<br>
nasty workaround consisting of instance ignore list, that prevents<br>
these hosts from being monitored, icinga would be full of hosts that<br>
are down. How would you apply an i_dont_exist puppet class to a nonexistent<br>
node? :P<br>
<br></blockquote><div><br></div><div>Did you put a bug in about the broken data?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I have nothing against "labs cloning production" besides that IMHO it<br>
should be the other way (production should actually clone labs, which<br>
is the testing env where changes should happen first before they get<br>
deployed on production), but still labs != production so I think we<br>
could have some extra thing here that would make it easier to manage<br>
icinga for regular, non-ops people which would exist on labs only and<br>
not on production.<br>
<div class="HOEnZb"><div class="h5"><br></div></div></blockquote><div><br></div><div>The biggest reason we can't do the same thing in labs and production for nagios is that in production the nagios configuration is generated via exported resources, which are disabled in labs.<br>
<br></div><div>As far as I know, that and ssh host keys are the only things in Wikimedia's puppet that require exported resources.<br><br></div><div>- Ryan<br></div></div></div></div>