*Scheduled downtime for Hadoop - Monday Jan 15th - 10:00 until 12:00 UTC*
Hello,
We need to perform some maintenance on our primary Hadoop cluster, which will require a period of *downtime*. This work is scheduled for *Monday Jan 15th - 10:00 until 12:00 UTC* - which is a US holiday for WMF and also Wikipedia Day https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Day.
This two-hour maintenance window has been chosen in the hope of minimising disruption for you. During the window, the cluster and the various tools that depend upon it, such as Superset and JupyterLab, will be largely unavailable.
The work being undertaken is a replacement of the Hadoop nameserver hosts https://phabricator.wikimedia.org/T332573 which, unfortunately, requires a full cluster restart. We will be disabling ingestion to HDFS, pausing Airflow DAGs on all instances, and stopping production data processing pipelines, prior to the work, then re-enabling them all afterwards. We are not expecting any gaps in data, once the pipelines have caught up again.
If you have any queries or concerns about this work, or if the time or date is particularly inconvenient for you, please don't hesitate to let us know, so that we can look to reschedule.
Kind regards, Ben
This is a quick reminder that this work on Hadoop will start in about 20 minutes' time.
Please refrain from launching any new jobs on the cluster and be aware that the cluster will have decreased availability for up to a couple of hours.
Kind regards, Ben
On 11/01/2024 12:23 pm, Ben Tullis wrote:
*Scheduled downtime for Hadoop - Monday Jan 15th - 10:00 until 12:00 UTC*
Hello,
We need to perform some maintenance on our primary Hadoop cluster, which will require a period of *downtime*. This work is scheduled for *Monday Jan 15th - 10:00 until 12:00 UTC* - which is a US holiday for WMF and also Wikipedia Day https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Day.
This two-hour maintenance window has been chosen in the hope of minimising disruption for you. During the window, the cluster and the various tools that depend upon it, such as Superset and JupyterLab, will be largely unavailable.
The work being undertaken is a replacement of the Hadoop nameserver hosts https://phabricator.wikimedia.org/T332573 which, unfortunately, requires a full cluster restart. We will be disabling ingestion to HDFS, pausing Airflow DAGs on all instances, and stopping production data processing pipelines, prior to the work, then re-enabling them all afterwards. We are not expecting any gaps in data, once the pipelines have caught up again.
If you have any queries or concerns about this work, or if the time or date is particularly inconvenient for you, please don't hesitate to let us know, so that we can look to reschedule.
Kind regards, Ben
-- *Ben Tullis* (he/him) Senior Site Reliability Engineer Wikimedia Foundation https://wikimediafoundation.org/