Hi,
We will be upgrading the Toolforge Kubernetes cluster on October
18th starting at around 11:00 UTC.
The expected impact is that tools running on the Kubernetes cluster will
get restarted a couple of times over the course of the few hours it takes
for us to upgrade the entire cluster. This includes tools that use the jobs
framework and tools that run web services using the default Kubernetes
backend. The ability to manage tools will remain operational.
Taavi
--
Taavi Väänänen (he/him)
Site Reliability Engineer, Cloud Services
Wikimedia Foundation
_______________________________________________
Cloud-announce mailing list -- cloud-announce(a)lists.wikimedia.org
List information: https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.…
I see T336057 was just closed. Looking at the docs <https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#View_web_service_logs>, I'm unclear how this works. The docs say ", the output from the webservice command is stored by the Toolforge Kubernetes infrastructure as long as the web service is running." So, what happens when a service exits (i.e. crashes)? Does that mean the logs for that service disappear?
Hi,
There is network maintenance work happening at the moment.
For the next few minutes, expect some brief network connectivity problems.
See also: https://phabricator.wikimedia.org/T347469
regards.
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation
Hello, I'm writing to notify any interested party that I am stepping down from
maintaining Quarry to focus on Superset. I'm not sure if anyone here would
be interested in stepping up as a maintainer to keep Quarry running.
If no maintainer steps forward, Quarry is likely to be removed when the
Buster images are removed by SRE, which I believe happens in June.
Should anyone be interested in being a maintainer for Quarry, please let me
know and I will happily add you as one.
Some discussion can be found at https://phabricator.wikimedia.org/T169452
Thank you
--
*Vivian Rook (They/Them)*
Site Reliability Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>
There is an ongoing outage affecting all Cloud VPS projects (this includes
Toolforge and PAWS) that prevents the machines from getting IP refreshes
(the DHCP client got uninstalled).
We are working on it and the service should be restored soon. We will send an
update once everything is up and running.
Tracking task: https://phabricator.wikimedia.org/T347665
Feel free to add a message there if your project is affected. We will make
sure to verify that it's back online once we roll out the fix.
Thanks for your patience!
Hello,
Today, 2023-09-20, we will conduct a maintenance operation on Cloud VPS. The
operation involves moving the OpenStack API endpoint to a new set of hardware
servers. No action is required from your side.
The Cloud VPS control plane, including Horizon, may be intermittently
unavailable during the operation. Virtual machines should keep running
unaffected, as should Toolforge tools.
See also: https://phabricator.wikimedia.org/T346439
regards.
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation
Hi there,
We recently did some DNS maintenance operations [0]. As part of the maintenance,
we moved the DNS recursor server IP address to 172.20.255.1.
However, we have detected a number of virtual machines with broken Puppet that
did not pick up the change.
The resolver is traditionally configured via the file /etc/resolv.conf with a line like:
=== 8< ===
nameserver 172.20.255.1
=== 8< ===
Please fix the virtual machines to use the new resolver, or if the machine is
not in use, consider shutting it down (or deleting it). Feel free to ask for
assistance if required [1].
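
One way to check a machine is sketched below in Python. It is only an illustration, not an official WMCS tool: the function names and the status labels are made up for this example, and the old recursor addresses (208.80.154.143 and 208.80.154.24) are the ones named in the 2023-09-11 announcement. It assumes the resolver is configured through plain "nameserver" lines in /etc/resolv.conf.

```python
from pathlib import Path

# New recursor address from this announcement.
NEW_RESOLVER = "172.20.255.1"
# Old recursor addresses, as listed in the 2023-09-11 announcement.
OLD_RESOLVERS = {"208.80.154.143", "208.80.154.24"}


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Commented-out lines start with '#' or ';', so their first
        # field is never exactly "nameserver" and they are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def resolver_status(resolv_conf_text):
    """Classify a resolv.conf as 'ok', 'stale' or 'unknown' (hypothetical labels)."""
    servers = nameservers(resolv_conf_text)
    if any(server in OLD_RESOLVERS for server in servers):
        return "stale"    # still points at an old recursor
    if NEW_RESOLVER in servers:
        return "ok"       # uses the new recursor only
    return "unknown"      # some other resolver setup; check by hand


if __name__ == "__main__":
    try:
        text = Path("/etc/resolv.conf").read_text()
    except OSError as error:
        print(f"could not read /etc/resolv.conf: {error}")
    else:
        print(resolver_status(text))
```

A result of "stale" or "unknown" only means the file deserves a closer look. On machines where /etc/resolv.conf is generated by another service, the fix needs to happen in that service (or in Puppet) rather than by editing the file directly.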
After some period of time, the old resolver addresses will stop working and
unmaintained virtual machines will become even more broken.
thanks, regards.
[0]
https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.…
[1]
https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_introduction#Commun…
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation
Hi,
Today, 2023-09-11, we will be conducting some internal Cloud VPS DNS service
operations:
* change the DNS recursor of every virtual machine running in Cloud VPS from
208.80.154.143 and 208.80.154.24 to 172.20.255.1 (this is traditionally done via
/etc/resolv.conf)
* change the real server behind the authoritative DNS
ns1.openstack.eqiad1.wikimediacloud.org, including the IP address, from
208.80.154.11 to 185.15.56.163
This may briefly affect some virtual machines, but the new DNS servers have been
running for a while already and we are not anticipating a major impact (famous
last words?).
Please report any problems you may find.
Some phabricator tickets tracking this work are:
* https://phabricator.wikimedia.org/T345240 cloudservices1006: put into service
* https://phabricator.wikimedia.org/T346033 cloudservices1004: decomission
* https://phabricator.wikimedia.org/T342621 eqiad1: cloudlb: transition DNS
clients (VMs) to the new BGP-based recursor VIP
regards.
--
Arturo Borrero Gonzalez
Senior SRE / Wikimedia Cloud Services
Wikimedia Foundation
To get long-running queries working in Superset, we need to move to the
in-cluster database. Further description is in
https://phabricator.wikimedia.org/T340623
Unfortunately, the external MySQL database is not directly compatible with the
internal PostgreSQL database, as described in https://phabricator.wikimedia.org/T343526
As such, the database move will drop the existing data. However, for anything
that is of interest, the old instance will be available until 2023-10-16 at:
https://old-superset.wmcloud.org/
From there you should be able to log in, go to the section with the desired
object (Dashboards, charts, datasets, saved queries), choose "Bulk Select",
select anything that is desired, then export. On the new Superset
(available at superset.wmcloud.org starting on 2023-09-11), go to the same
section and push the import button (which looks suspiciously like a
download button) and import the file that you exported from the old
Superset.
--
*Vivian Rook (They/Them)*
Site Reliability Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>
After nearly a decade of mishap and delay, we have updated the WMCS
terms of use. The updated document for Toolforge and Cloud VPS admins
can be found here:
https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_Terms_of_use
and the terms of use for visitors to WMCS sites can be found here:
https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_End_User_Terms_…
There is one significant change in these terms: Cloud VPS projects which
collect personal data will need to include an explicit privacy policy
for their projects. This is section 7.3. For other WMCS users and admins,
these documents do not represent any significant change in policy, but
do clarify and finalize many things that were poorly worded in the
previous TOU, as well as policies that we have enforced informally without
officially stating them.
Please feel free to reach out to WMCS staff if you find any part of
these documents concerning or disruptive to your work on our platforms.
-Andrew