We'd like to schedule a maintenance window for all the stat machines:
*In this maintenance window, we will:*
* Upgrade the conda-analytics Debian package. This package is how we deploy
Spark3. The upgrade will allow us to run JupyterHub and JupyterLab on top
of it.
* Deploy new puppet configurations that will switch from running Jupyter on
top of anaconda-wmf to running on top of conda-analytics.
* This effectively upgrades our Jupyter deployment as follows:
* Upgrades JupyterHub from 1.1.0 to 1.5.0.
* Upgrades JupyterLab from 3.2.9 to 3.4.8.
* Upgrades Spark on newly created conda environments from 2.4.4 to Spark3.
* Upgrades wmfdata on newly created conda environments to 2.0.0. This
version of wmfdata includes breaking changes.
*IMPORTANT*: After this maintenance window, you should expect the following:
* JupyterHub and any existing JupyterLab processes, including any running
kernels, will be shut down and will have to be restarted manually by their
respective owners.
* The JupyterLab UI will have minor changes.
* New conda environments created via JupyterHub will now be based off of
conda-analytics and will utilize Spark3.
* Existing conda environments based off of anaconda-wmf will continue to
work, and will continue to utilize Spark2.
As part of our effort to deprecate Spark2 and make Spark3 widely available,
we are deprecating the use of anaconda-wmf and Spark2, in favor of
conda-analytics and Spark3.
Note that this is just a deprecation: you will still be able to use your
existing conda environments running on top of anaconda-wmf and Spark2.
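If you are unsure which stack one of your existing environments is on, the installed pyspark major version tells you. The helper below is purely illustrative (the function name and era labels are ours, not part of anaconda-wmf or conda-analytics tooling); run it with the conda environment activated:

```python
# Illustrative only: classify an activated conda environment by the Spark
# major version its pyspark package provides. Labels are our own shorthand.
from importlib.metadata import PackageNotFoundError, version

def spark_lineage() -> str:
    """Return a short description of the environment's Spark lineage."""
    try:
        major = int(version("pyspark").split(".")[0])
    except PackageNotFoundError:
        return "no pyspark installed"
    era = "conda-analytics era" if major >= 3 else "anaconda-wmf era"
    return f"Spark {major} ({era})"

print(spark_lineage())
```

In an anaconda-wmf environment this would report Spark 2; in a new conda-analytics environment, Spark 3.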
We are proposing the following window for these changes:
Wednesday 30 Nov 2022 12:30 to 13:30 UTC.
(7:30 AM ET / 4:30 AM PT)
The old anaconda-wmf base conda environment:
The new conda-analytics base conda environment:
Please let us know if you have any questions or objections.
Xabriel J. Collazo Mojica (he/him)
Sr Software Engineer
Looking for a time to reboot two of our analytics explorer (stat) servers
for a kernel upgrade.
These are stat1005.eqiad.wmnet and stat1008.eqiad.wmnet.
I would like to handle the reboots on *Tuesday 15th November 2022 between
06:00 UTC and 06:30 UTC*.
Kindly let me know if a maintenance window within these times would cause
an inconvenience, so that I can push back the reboots to accommodate your
needs.
The Data Engineering team is upgrading to Spark 3 and will no longer be
supporting Spark 2 jobs on the Hadoop cluster after March 31st, 2023. If
your team owns Spark 2 jobs in production, please plan for the time needed
to upgrade your jobs. For all future work, please use Spark 3.
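When triaging which of your production jobs still need attention, the cutoff is simply the Spark major version. A minimal sketch (the job list and helper function are hypothetical examples, not taken from the migration page):

```python
# Hypothetical triage sketch: given a mapping of job name -> Spark version,
# list the jobs still on Spark 2 that need migrating before the deadline.
def jobs_needing_migration(jobs: dict[str, str]) -> list[str]:
    """Return job names whose Spark major version is below 3."""
    return sorted(
        name for name, ver in jobs.items()
        if int(ver.split(".")[0]) < 3
    )

example = {"hourly_pageviews": "2.4.4", "daily_rollup": "3.1.2"}
print(jobs_needing_migration(example))  # → ['hourly_pageviews']
```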
You can find more information about the upgrade on:
Please add any missing jobs to the migration list on that page. If you need
help from the data engineering team, you can reach out to Jackeline Argüello
<jarguello-ctr(a)wikimedia.org> or join us for the data engineering office
hours.
*Olja Dimitrijevic* (she/her)
Director of Data Engineering
Wikimedia Foundation <https://wikimediafoundation.org/>