Hi everybody,
the Analytics team has packaged a new version of Hue (basically the latest
upstream), 4.7.1, available for testing at https://hue-next.wikimedia.org.
Several improvements:
- Python 3 support
- CAS SSO support
- User auto-creation upon first login (no more sync from LDAP etc.)
Current problems:
- We had to follow up with several patches to upstream
(https://github.com/cloudera/hue) due to py2->py3 porting issues, as well as
JS ones.
- The UI look and feel is very different from what we are used to, and some
viz bugs are still present (see https://phabricator.wikimedia.org/T264896)
The plan is to move hue.wikimedia.org to the new version as soon as
possible, and then fix bugs that are outstanding as we go. With this
upgrade we don't want to state that Hue will be supported forever (on the
contrary, we'd still love to deprecate it), but for the time being (namely
until we find a replacement for Oozie for example) Hue will remain
supported.
Any feedback about bugs etc. would be really appreciated in
https://phabricator.wikimedia.org/T264896. Also let us know what you think
about the new version (if you use Hue often).
Thanks in advance,
Luca
Hi!
This Friday, 2020-10-30, I will be doing some maintenance on stat1005 in the
EU/CET morning. During this window, everything running there will be disrupted
and there will be multiple reboots. Afterwards, the machine will be running a newer
kernel (5.8) and updated GPU drivers/rocm library (3.8). Should the update
fail, or the subsequent tests show that workloads break, we will roll back to
4.19 and rocm33.
If you have any questions or concerns, let us know.
Best,
Tobias
--
Tobias Klausmann, SRE, Wikimedia Foundation
Hi!
In our quest to make the GPU-equipped machines in analytics ever more useful,
we are going to update the rocm software suite and driver on stat1005 and
stat1008 to the latest version, 3.8.0.
Since this will necessitate a reboot, this is the early warning that on
2020-11-23 (Friday), I will update stat1005. Disruption will likely be less
than an hour. In case the update breaks stuff, we will roll back to v3.3.0.
The update of stat1008 will happen next week, on 2020-11-27 Tuesday, and there
will be a separate reminder for that on Monday.
I will send an all-clear message to these lists once the update is done. For
more details on the process, see https://phabricator.wikimedia.org/T264408
As always, if there is anything out of order, don't hesitate to contact us.
Best,
Tobias
--
Tobias Klausmann, SRE, Wikimedia Foundation
Hi,
We are planning maintenance work on the following hosts at 20:00 - 21:00
UTC on Wednesday October 21. During the maintenance window, each host will
become momentarily inaccessible for about a minute as we perform an upgrade.
- hue.wikimedia.org
- turnilo.wikimedia.org
- superset.wikimedia.org
- piwik.wikimedia.org
For context on the change, see https://phabricator.wikimedia.org/T240439.
Our scheduled maintenance can be viewed at
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Maintenance_Schedule.
This will only affect the web interfaces. Any processes that are running on
these hosts will not be affected (the machines will not be restarted).
We'll send a reminder before the maintenance happens and message out when
it is complete. If something doesn't look right, message me at razzi on
Freenode or reach us at the #wikimedia-analytics channel.
Regards,
Razzi & the Analytics team
Hi everybody,
we have to reboot stat1005 and stat1008 to pick up correct GPU settings,
and we'd like to do it on Friday 16th (early EU morning).
Please let us know if this impacts your work; in that case we'll try to find
another time window :)
Maintenance also advertised in
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Maintenance_Schedule
Luca (on behalf of the Analytics team)
Dear users of stat100{4,6,7},
we are planning on upgrading stat1004 to Debian Buster this Thursday
(2020-09-17) after 12:00 CEST (10:00 UTC). We will reinstall the machine,
preserving user data (home directories, /srv), but to be on the safe side,
we will back up that data. After the reinstall and a few tests, we will send
an all-clear to this list.
A few things of note:
- It would be greatly appreciated if you cleaned out unneeded data before the
backup time mentioned above, thus speeding up backup (and restore if we need
it).
- Any changes made to the file system contents after the time mentioned above
may be lost.
- Around the time of the backup, both cron and systemd timers will be
disabled, and still-running processes may be ungracefully terminated.
If this process works well, the remaining stat100x machines in need of update
(6, 7) will be processed in a similar manner.
As always, if there are questions, do not hesitate to contact us.
Best,
Tobias
--
Tobias Klausmann, SRE, Wikimedia Foundation
Hi everybody,
Due to an issue with the GPU, we have to reboot stat1005. The maintenance is
scheduled for Monday Oct 5th, in the early EU morning. Please let us know
if this is a problem :)
Luca (on behalf of the Analytics team)
Hi everybody,
We need to reboot stat1004 to apply some kernel settings. The maintenance
is scheduled for Friday 25th during the early EU morning. Please let us know
if this impacts your work.
Luca (on behalf of the Analytics team)
Hi everybody,
We created
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Maintenance_Schedule
to help all users prepare for upcoming scheduled maintenance windows. Every
maintenance window will be announced on this mailing list and added to the
wiki page. Hope it helps!
Luca (on behalf of the Analytics team)