On Wed, Dec 6, 2017 at 11:03 AM, Luca Toscano <ltoscano@wikimedia.org> wrote:
Hi everybody,

as outlined in https://phabricator.wikimedia.org/T181518, the Analytics team needs to repurpose the notebook1002 host (one of the PAWS/Jupyter nodes) as a Kafka Analytics broker for an urgent maintenance procedure.
 
To clarify: I understand that's about SWAP (the internal analytics notebooks platform for accessing private data like the webrequest table in Hive and EventLogging tables in MySQL), not the public PAWS platform, correct?
 
We are not aware of anybody actively using it (as it happens with notebook1001) but to be on the safe side all the home directories will be saved on notebook1001's /srv directory in case somebody needs that data.

It sounded like the second machine was intended and needed more for stability and maintainability than for load balancing? (The instructions at https://wikitech.wikimedia.org/wiki/SWAP#Access only mention notebook1001 as the access point, so it wouldn't be surprising if fewer users went to notebook1002.) So does this recommissioning have implications for the stability and maintainability of SWAP? Just as an example, would we still be able to upgrade the Jupyter version without hassle? (It runs 4.2.0, which is one and a half years old at this point and quite a few bug fixes and features behind the current version, 5.2.1.)


We are in the process of ordering new hardware to replace the current notebook1001 and 1002 hosts, so the absence of notebook1002 will be only temporary.

Is there a Phab ticket for this? (At https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2017-18_Q2 there is a link labeled "Hardware refresh jupyter notebooks", but it returns a 404.)

In any case, thanks for your work in this area (and for posting the heads-up here)! SWAP is a really important tool for my work and that of the other data analysts in the Audiences department, as well as for other people's.


--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB