Hi folks!
I'm trying to set up labmon1002 as a cold standby for labmon1001.
We need to sync the whisper files from one server to the other, so that if
we lose labmon1001 we don't lose all the metrics.
Regarding hiera, in my mind it was as simple as having 2 hiera keys
(names aren't set in stone):
* wmcs::monitoring::server labmon1001.eqiad.wmnet
* wmcs::monitoring::server_standby labmon1002.eqiad.wmnet
And then:
* have all clients send data to 'wmcs::monitoring::server'
* In case of an outage, simply flip the keys
* the rsync cronjob runs on the server named by 'wmcs::monitoring::server_standby'
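A minimal sketch of that layout as hiera YAML. The file path in the comment is an assumption (where a truly global key should live is part of the open question), and the key names are the provisional ones above:

```yaml
# hieradata/common.yaml (hypothetical location for global keys)
# Active monitoring server: all clients send their data here.
wmcs::monitoring::server: 'labmon1001.eqiad.wmnet'
# Cold standby: hosts the rsync cronjob for the whisper files.
wmcs::monitoring::server_standby: 'labmon1002.eqiad.wmnet'
```

During an outage, the only change would be swapping the two values.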
If you grep the ops/puppet.git repo, you may find *a lot* of calls
to 'labmon1001.eqiad.wmnet'. Examples:
* hieradata/common/profile/openstack/labtest.yaml
  profile::openstack::labtest::statsd_host: 'labmon1001.eqiad.wmnet'
* hieradata/common/profile/openstack/main.yaml
  profile::openstack::main::statsd_host: 'labmon1001.eqiad.wmnet'
* hieradata/labs/deployment-prep/common.yaml
  service::configuration::statsd_host: labmon1001.eqiad.wmnet
* hieradata/labs/deployment-prep/common.yaml
  graphite_host: labmon1001.eqiad.wmnet
To improve maintainability a bit, I thought of using a single hiera key,
the toplevel 'wmcs::monitoring::server', so that in case of an outage we
don't have to update a lot of LOCs to point to the standby server.
That is, a kind of code factorization.
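One way this factorization might look, assuming hiera's interpolation lookup function can resolve the toplevel key from these files (not verified against our hierarchy):

```yaml
# hieradata/common/profile/openstack/main.yaml
# Reference the toplevel key instead of hardcoding the FQDN.
profile::openstack::main::statsd_host: "%{hiera('wmcs::monitoring::server')}"

# hieradata/labs/deployment-prep/common.yaml
service::configuration::statsd_host: "%{hiera('wmcs::monitoring::server')}"
graphite_host: "%{hiera('wmcs::monitoring::server')}"
```

With this, flipping the toplevel key would repoint every consumer at once, provided the key is visible at the hierarchy level where each of these files is read.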
Hiera is new to me, and I've been doing some testing: test
compilations and playing with utils/hiera_lookup [0].
In the end, this doesn't seem to work because:
* my new hiera keys are not found (why is hieradata/labs.yaml never read?)
* some other weirdness unknown to me
* isn't there a way to introduce a global hiera key for all our environment?
So, would you please share some hints? What do you think about this
whole picture? Do you have any suggestions for the hiera key layout?
Thanks in advance for your time! :-)
Relevant phabricator tasks:
* labmon1002 as cold standby for labmon1001
  https://phabricator.wikimedia.org/T189871
* labmon: syncronize whisper files between labmon1001 and labmon1002
  https://phabricator.wikimedia.org/T190512
[0] the command lines used are things like:
% utils/hiera_lookup --fqdn=labmon1002.eqiad.wmnet \
    --roles=labs::monitoring profile::labs::monitoring::master -v
% utils/hiera_lookup --fqdn=labmon1002.eqiad.wmnet \
    profile::labs::monitoring::master -v