Hi everybody,
I need to shut down stat1004 tomorrow (Nov 25th) at around 16:00 CET to
allow SRE to move it to another rack (a physical move in the datacenter). The
downtime should be minimal (half an hour at most).
Please let me know if this impacts your work!
Also added to
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Maintenance_Schedule
Luca (on behalf of the Analytics team)
Hi everybody,
on Monday 5th there will be some downtime of stat1005 and stat1008
(hopefully one hour max in total) to expand their RAM to 1.5TB (!!!!). The
maintenance is scheduled to start around 16:00 CEST.
As always, please let us know if this impacts your work!
Luca (on behalf of the Analytics team)
Hi!
Next Monday, 2020-11-16, I will be doing some maintenance on stat1008 in the
EU/CET morning. During this window, all services on the machine will be
disrupted and there will be multiple reboots. Afterwards, the machine will be
running a newer kernel (5.8) and updated GPU drivers/ROCm library (3.8). This
is the same update as the one I did the week before last on stat1005.
If you have any questions or concerns, let us know.
Best,
Tobias
--
Tobias Klausmann, SRE, Wikimedia Foundation
Hi Data Folks,
*TL;DR:* We plan to update the wmf.webrequest table on Monday, November 23rd
with this change
<https://gerrit.wikimedia.org/r/c/analytics/refinery/+/638086> - Please get
in touch on this task <https://phabricator.wikimedia.org/T267008> if you
run Hive queries that take advantage of the TABLESAMPLE feature on this table.
*Why?*
Testing the changes, we have seen:
- A gain of more than 15% in global CPU time per computed partition, saving
more than 300 hours of CPU per month.
- Wall-clock time of the webrequest load job almost halved (when the
cluster is not busy).
- Reduced disk and network usage thanks to there being less data to
shuffle-sort: we halved the amount of data to be written/sent/read.
*What changes?*
The change visible to users of the table is the increase of the number of
buckets by which the table is bucketed, from 64 to 256. This means that for
any leaf partition (webrequest_source, year, month, day, hour - actual
folders where data files are stored), there will be 256 files instead of
64. The bucketing strategy won't change, meaning that rows will still be
distributed across the files using the (hostname, sequence) field pair, in
that order.
Changes invisible to users are improvements in the hive query
loading/augmenting the data into the partitions.
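To make the user-visible part of the change concrete, the bucketing lives in the table DDL. The sketch below is illustrative only (the real table has many more columns and storage properties, elided here); the key difference is the bucket count going from 64 to 256:

```sql
-- Hedged sketch of the new wmf.webrequest bucketing clause.
-- Column list and storage details are placeholders, not the real DDL.
CREATE TABLE wmf.webrequest (
  hostname STRING,
  sequence BIGINT
  -- ... many more columns elided ...
)
PARTITIONED BY (webrequest_source STRING, year INT, month INT, day INT, hour INT)
CLUSTERED BY (hostname, sequence) INTO 256 BUCKETS  -- previously 64
STORED AS PARQUET;
```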
*How does the change impact users?*
We plan to drop the table (the structure, not the data!) and recreate it
with the new bucketing number, re-adding existing partitions.
This drop-recreate should go unnoticed, as it is fast to execute. As new
data flows in and old data is deleted, it will take three months for the
whole table to be converted. During those three months, partitions containing
64 files will still be usable, but queries taking advantage of buckets
through the TABLESAMPLE feature will be broken for those partitions.
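For illustration, a bucket-sampling query of the kind affected might look like the sketch below (column names other than hostname/sequence, and the partition values, are examples, not a prescription):

```sql
-- Read roughly 1/256th of one hourly partition by sampling a single
-- bucket on the clustering columns. Against a partition still laid out
-- in 64 files, a BUCKET x OUT OF 256 clause will not match the file
-- layout, so such queries break until that partition ages out.
SELECT hostname, sequence
FROM wmf.webrequest
TABLESAMPLE (BUCKET 1 OUT OF 256 ON hostname, sequence)
WHERE webrequest_source = 'text'
  AND year = 2020 AND month = 11 AND day = 23 AND hour = 0;
```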
Don't hesitate to reach out if you have questions :)
--
Joseph Allemandou (joal) (he / him) on behalf of the Analytics-Engineering
team
Staff Data Engineer
Wikimedia Foundation
Hi everybody,
we are going to expand the available RAM on an-coord1001, the host in the
Analytics infrastructure that runs Hive/Presto/Oozie/Airflow. The procedure
should last 30 to 40 minutes in the optimistic case, and it will involve
shutting down the host (hence all daemons running on it) to allow SRE to
install the new RAM modules.
As always please reach out to us if this impacts your work.
Thanks in advance,
Luca (on behalf of the Analytics team)