Hi Tilman,
2017-12-15 8:53 GMT+01:00 Tilman Bayer <tbayer(a)wikimedia.org>:
On Wed, Dec 6, 2017 at 11:03 AM, Luca Toscano <ltoscano(a)wikimedia.org>
wrote:
Hi everybody,
as outlined in https://phabricator.wikimedia.org/T181518, the Analytics
team needs to repurpose the notebook1002 host (one of the PAWS/Jupyter
nodes) as a Kafka Analytics broker for an urgent maintenance procedure.
To clarify: I understand that's about SWAP
<https://wikitech.wikimedia.org/wiki/SWAP> (the internal analytics
notebooks platform for accessing private data like the webrequest table in
Hive and EventLogging tables in MySQL), not the public PAWS
<https://www.mediawiki.org/wiki/PAWS> platform, correct?
Correct, I didn't know the difference, thanks for the pointer.
We are not aware of anybody actively using it (as it happens with
notebook1001) but to be on the safe side all the home directories will be
saved on notebook1001's /srv directory in case somebody needs that data.
It sounded like the second machine was actually more intended and needed
for stability and maintainability, rather than for load balancing? (The
instructions at https://wikitech.wikimedia.org/wiki/SWAP#Access only
mention notebook1001 as access point, so it wouldn't be surprising if fewer
users went to notebook1002.) So does this recommissioning have implications
on the stability and maintainability of SWAP? Just as an example, would we
still be able to upgrade the Jupyter version without hassle (it runs 4.2.0
which is one and a half years old at this point, and quite a few bug fixes
and features behind the current version, 5.2.1)?
We'll have new hardware soon and notebook100[12] will be running on new
hosts very soon, so I don't picture any issue for the medium/long term. It
is only a temporary measure for the immediate short term.
We are in the process of ordering new hardware to replace the current
notebook1001 and 1002 hosts, so the absence of notebook1002 will be only
temporary.
Is there a Phab ticket for this? (At
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2017-18_Q2
there is a link labeled "Hardware refresh jupyter notebooks", but it is 404.)
The task is https://phabricator.wikimedia.org/T175603, marked as
Operations/Procurement; its custom view policy might not work for
everybody. We'll open a Phab task to track the replacement of the
notebook100[12] nodes as soon as they are ready in the data center, and
post the link to this mailing list so people will be aware.
In any case, thanks for your work in this area (and for posting the
heads-up here)! SWAP is a really important tool for the work of myself and
the other data analysts in the Audiences department, and other people's as
well.
The Analytics team will do as much as possible to support SWAP; we do value
this project :)
Luca