On Fri, Oct 18, 2019 at 2:25 PM Bryan Davis <bd808@wikimedia.org> wrote:
- It seems reasonable that DNS records for *.wmcloud.org should all
relate to publicly routable addresses. Does it also seem reasonable that *.wikimedia.cloud DNS records should all relate to non-publicly routable addresses? If an address from publicly routable IPv4 space is being used only to provide a service IP for a service that is 100% internal to Cloud VPS (like the example of the HAProxy cluster fronting a Kubernetes cluster inside the tools project), is it acceptable to create an A record in the *.wikimedia.cloud zone for that address?
yeah, that all sounds good to me.
- Should we have a floating IP pool of non-publicly routable IPv4
addresses for use cases where the service that is being provided is only intended to be internal to a single project or the Cloud VPS tenant network? Routable IPv4 addresses are a limited commodity, and currently Cloud VPS has a very small number of them available.
One of the reasons that we are spending time discussing these things is that we hope to decide on a set of standards and practices which will make it easier to reason about use and maintenance of Cloud VPS. Towards that end, I think I would propose these answers to the new questions:
- The *.wikimedia.cloud zone should only contain A records pointing to
non-publicly routable IPv4 addresses. My reasoning for this is that it makes it easier to quickly think about the general threat model for a FQDN in this zone. If the FQDN ends in wikimedia.cloud, then the IP associated with that FQDN is not publicly routable.
- We should create a pool of floating IPs using non-publicly routable
IPv4 addresses for the explicit purpose of being used as service IPs for HA/LB systems within the Cloud VPS internal network. These service IPs would then be given FQDNs in the *.wikimedia.cloud zone without breaking the rule that all FQDNs ending in wikimedia.cloud are not publicly routable.
  - We may have to adjust a lot of security groups if the new floating pool of private IPs is outside of the 172.16.0.0/21 subnet. If we carve it out as a subset of that CIDR then I *think* it requires careful changes to the existing "cloud-instances2-b-eqiad" subnet to cordon off a /24 or /25 block into a new neutron subnet that would be the source of the new pool and not part of the subnet that nova/neutron use to give out fixed IPs to instances. Hopefully Arturo or Jason can reason about the work and impact of this better than I can.
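To make the carve-out idea above concrete, here is a minimal Python sketch using only the standard `ipaddress` module. The 172.16.0.0/21 range comes from the thread; the choice of 172.16.7.0/24 as the cordoned-off block and the function names are assumptions for illustration only, not an agreed plan. `is_private` is used as a rough stand-in for "not publicly routable".

```python
import ipaddress

# From the thread: the existing instance subnet.
INSTANCE_SUBNET = ipaddress.ip_network("172.16.0.0/21")

# Assumption (hypothetical): the last /24 of the range is the block to cordon off.
SERVICE_POOL = ipaddress.ip_network("172.16.7.0/24")


def remaining_blocks(parent, carved):
    """Return the CIDR blocks of `parent` left over once `carved` is removed."""
    if not carved.subnet_of(parent):
        raise ValueError(f"{carved} is not inside {parent}")
    return sorted(parent.address_exclude(carved))


def allowed_in_private_zone(address):
    """Rough check for the proposed zone rule: the address is not publicly routable."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    for block in remaining_blocks(INSTANCE_SUBNET, SERVICE_POOL):
        print(block)
```

The leftover blocks are what the fixed-IP subnet would have to be described as after the carve-out, which gives a sense of how much of the existing configuration would need to change.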
I wouldn't create another floating IP pool or make any changes to the `cloud-instances2-b-eqiad` subnet (`172.16.0.0/21`). Reconfiguring it as an external network (a requirement for floating IPs) would bring a lot of complexity with host routing, NAT rules, subnets and security groups.
Instead of using floating IPs for non-publicly routed subnets, I'd pre-allocate an IP address from that subnet, configure the front-end load balancer host's neutron ports with allowed address pairs, and use VRRP + keepalived to manage which host has the active virtual IP (VIP).
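As a rough illustration of the allowed address pairs step, the Python sketch below builds the request body that a Neutron port update (`PUT /v2.0/ports/{port_id}`) would carry. The VIP `172.16.7.10` and the function name are hypothetical, and the keepalived/VRRP configuration on the hosts is not shown. Note that the update sets the whole `allowed_address_pairs` list, so any existing pairs on the port would need to be included too.

```python
import ipaddress
import json

# From the thread: the existing instance subnet the VIP is pre-allocated from.
INSTANCE_SUBNET = ipaddress.ip_network("172.16.0.0/21")


def allowed_address_pair_update(vip, subnet=INSTANCE_SUBNET):
    """Build the body of a Neutron port update that lets a port carry `vip`."""
    if ipaddress.ip_address(vip) not in subnet:
        raise ValueError(f"{vip} is not in {subnet}")
    return {"port": {"allowed_address_pairs": [{"ip_address": vip}]}}


if __name__ == "__main__":
    # Hypothetical VIP; the same body would be applied to each
    # front-end load balancer host's port.
    print(json.dumps(allowed_address_pair_update("172.16.7.10"), indent=2))
```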
I can put together a simple prototype of this if there's any interest in going down that path.
Regards, Jason