Hello everyone,
TL;DR: if you are not deploying services to the codfw wikikube Kubernetes cluster, you can safely skip this.
Long version:
We will reinitialize the codfw wikikube Kubernetes cluster with Kubernetes version 1.23 on 2023-02-21 09:00-16:00 UTC [1] (the actual process is expected to take a couple of hours within this window). The date was chosen for convenience, as we will have depooled all active/active services from codfw for row B switch maintenance [2] anyway. As all traffic will be drained beforehand, we expect no user-visible impact. However, for the duration of the process, the Kubernetes cluster will be unavailable to deployers, and thus efforts to deploy to it will fail or, worse, not have the expected outcomes. This is normal until SRE serviceops announces that the cluster is fully operational again.
SRE serviceops will be deploying all services before marking the cluster as usable and pooling traffic back to it, so there will be no need for deployers to re-deploy their services (apart from those already informed).
[1] https://phabricator.wikimedia.org/T329664
[2] https://phabricator.wikimedia.org/T327991
Regards,
Janis Meybohm
Hello everyone,
The cluster was successfully reinitialized today; all services have been re-pooled and are in service. The cluster is fully operational again and can be used by deployers.
Regards,
On Wed, Feb 15, 2023 at 1:41 PM Janis Meybohm jmeybohm@wikimedia.org wrote: