Debian Stretch's security support ends in mid 2022, and the Foundation's
OS policy already discourages use of existing Stretch machines. That
means that it's time for all project admins to start rebuilding their VMs
with Bullseye (or, if you must, Buster).
Any webservice running in Kubernetes that was created in the last year or
two is most likely using a Buster image already, so there's no action
needed for those. Older Kubernetes jobs should be refreshed to use more
modern images whenever possible.
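For Toolforge tools, one rough way to check and refresh the image is
sketched below; the exact kubectl output and the image type name
(python3.9) are assumptions and depend on what your tool actually uses:

  # show which container image the tool's pods are currently running
  kubectl get pods -o jsonpath='{.items[*].spec.containers[*].image}'; echo
  # then restart the webservice on a newer, Buster-based image type, e.g.:
  webservice --backend=kubernetes python3.9 restart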
If you are still using the grid engine for webservices, we strongly
encourage you to migrate your jobs to Kubernetes. For other grid uses,
watch this space for future announcements about grid engine migration;
we don't yet have a solution prepared for that.
Details about the what and why for this process can be found here:
https://wikitech.wikimedia.org/wiki/News/Stretch_deprecation
Here is the deprecation timeline:
March 2021: Stretch VM creation disabled in most projects
July 6, 2021: Active support of Stretch ends, Stretch moves into LTS
<- You are Here ->
January 1, 2022: Stretch VM creation disabled in all projects,
deprecation nagging begins in earnest. Stretch alternatives will be
available for tool migration in Toolforge
May 1, 2022: All active Stretch VMs will be shut down (but not deleted)
by WMCS admins. This includes Toolforge grid exec nodes.
June 30, 2022: LTS support for Debian Stretch ends, all Stretch VMs will
be deleted by WMCS admins
Hello cloud-vps users,
There are still about 84 unclaimed projects at
https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2021_Purge
Please take a moment to look at that page and mark projects that you are
using.
Unclaimed projects will be in danger of shutdown on February 1st, 2022.
Thank you to those of you who have already acted on this.
Thank you!
- Komla
-------- Forwarded Message --------
Subject: Cloud VPS users, please claim your projects (and, introducing
Komla)
Date: Thu, 2 Dec 2021 14:42:08 -0600
From: Andrew Bogott <abogott(a)wikimedia.org>
Reply-To: abogott(a)wikimedia.org
Organization: The Wikimedia Foundation
To: Cloud-announce(a)lists.wikimedia.org
CC: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
Hello cloud-vps users!
It's time for our annual cleanup of unused projects and resources. Our new
developer advocate Komla Sapaty will be guiding this process; please
respond promptly to his emails and do your best to make him feel welcome!
Every year or so the Cloud Services team tries to identify and clean up
unused projects and VMs. We do this via an opt-in process: anyone can mark
a project as 'in use,' and that project will be preserved for another year.
I've created a wiki page that lists all existing projects, here:
https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2021_Purge
If you are a VPS user, please visit that page and mark any projects that
you use as {{Used}}. Note that it's not necessary for you to be a project
admin to mark something -- if you know that you're currently using a
resource and want to keep using it, go ahead and mark it accordingly. If
you /are/ a project admin, please take a moment to mark which VMs are or
aren't used in your projects.
When February arrives, I will shut down unused projects and begin the
process of reclaiming their resources.
If you think you use a VPS project but aren't sure which, I encourage you
to poke around on https://tools.wmflabs.org/openstack-browser/ to see what
looks familiar. Worst case, just email cloud(a)lists.wikimedia.org with a
description of your use case and we'll sort it out there.
Toolforge-only users are free to ignore this email and future related
reminders.
Thank you!
-Andrew and the WMCS team
Hi all,
I had previously found a way to mount the home directory of my tool on
Toolforge onto my local Linux machine using a *mount* command. It was so
handy that I turned it into a bash alias and used it on the daily.
Sadly, due to a hardware malfunction, I lost my machine and my
.bash_aliases file with it. I cannot find the mount command anywhere on
Wikitech wiki either. Do you happen to know what the command should look
like?
It is just so much more convenient to mount the drive and use local IDEs to
modify code.
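If memory serves, it was sshfs-based; my best (unverified) reconstruction,
with <shell-user> and <tool> as placeholders, looked roughly like this:

  mkdir -p ~/toolforge-home
  sshfs <shell-user>@login.toolforge.org:/data/project/<tool> ~/toolforge-home

but I don't remember the exact options I used.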
PS: I promise to add it to the wiki so others can use it
Thanks,
Huji
Dear Wikimedia cloud support
What storage options does the Wikimedia cloud have? Can external developers
(i.e. people not employed by the Wikimedia foundation) write to Cinder
and/or Swift? Either from Toolforge or from Cloud VPS?
See below for context. (Actually, is this the right list, or should I ask
elsewhere?)
For Wikidata QRank [https://qrank.toolforge.org/], I run a cronjob on the
toolforge Kubernetes cluster. The cronjob mainly works on Wikidata dumps
and anonymized Wikimedia access logs, which it reads from the NFS-mounted
/public/dumps/public directory. Currently, the job produces 40 internal
files with a total size of 21G; these files need to be preserved between
individual cronjob runs. (In a forthcoming version of the cronjob, this
will grow to ~200 files with a total size of ~40G). For storing these
intermediate files, Cinder might be a good solution. However, afaik Cinder
isn’t available on Toolforge. Therefore, I’m currently storing the
intermediate files in the account's home directory on NFS. Presumably (I'm
not sure, just speculating because I've seen NFS crumble elsewhere)
Wikimedia's NFS server can be easily overloaded; in any case, Wikimedia's
NFS server seems to protect itself by throttling access. Because of the
throttling, the cronjob is slow when working with its intermediate files.
* Will Cinder be made available to Toolforge users? When?
* Or should I move from Toolforge to Cloud-VPS, so I can store my
intermediate files on Cinder?
* Or should I store my intermediate files in some object storage? Swift?
Ceph? Something else?
* Is access to Cinder and Swift subject to the same throttling as NFS? Or
will moving away from NFS increase the available I/O throughput?
The final output of the QRank system is a single file, currently ~100M in
size but eventually growing to ~1G. When the cronjob has computed a fresh
version of its output, it deletes any old outputs from previous runs (with
the exception of the two most recent previous versions, which are kept
around internally for debugging). Typical users are other bots or external
pipelines that need a signal for prioritizing Wikidata entities, not end
users on the web. Users typically check for updates with HTTP HEAD, or with
conditional HTTP GET requests (using the standard If-Modified-Since and
If-None-Match headers). Currently, I’m serving the output file with a
custom-written HTTP server that runs as a web service on
Toolforge behind Toolforge’s nginx instance. My server reads its content
from the NFS-mounted home directory that’s getting populated by the
cronjob. Now, it’s not exactly a great idea to serve large data files from
NFS, but afaik it’s the only option available in the Wikimedia cloud, at
least for Toolforge users. Of course I might be wrong.
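For concreteness, a typical client check looks roughly like this (the
download path below is just a placeholder, not necessarily the real URL):

  # HEAD request to read Last-Modified / ETag without downloading
  curl -sI https://qrank.toolforge.org/download/qrank.csv.gz
  # conditional GET: only re-download if the server copy is newer than the local file
  curl -sO -z qrank.csv.gz https://qrank.toolforge.org/download/qrank.csv.gz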
* Should I move from Toolforge to Cloud-VPS, so I can serve my final output
files from Cinder instead of NFS?
* Or should I rather store my final output files in some object storage?
Swift? Ceph? Something else?
* Or is NFS just fine, even if the size of my data grows from 100M to 1G+?
The cronjob also uses ~5G of temporary files in /tmp, which it deletes
towards the end of each run. The temp files are used for external sorting,
so all access is sequential. I’m not sure where these temporary files
currently sit when running on Toolforge Kubernetes. Given their volume, I
presume that the tmpfs of the Kubernetes nodes will eventually run out of
memory and then fall back to disk, but I wouldn’t know how to find this
out. _If_ the backing store for tmpfs eventually ends up being mounted on
NFS, that sounds wasteful for the poor NFS server, especially since the
files get deleted at job completion. In that case, I'd love to save common
resources by using a local disk. (It doesn’t have to be an SSD; a spinning
hard drive would be fine, given the sequential access pattern). But I’m not
sure how to set this up on Toolforge Kubernetes, and I couldn’t find docs
on wikitech. Actually, this might be a micro-optimization, so perhaps not
worth the trouble. But then again, I'd like to be gentle with the precious
shared resources in the Wikimedia cloud.
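If it helps with diagnosing this, I could run something like the following
from inside the running container (assuming df/findmnt are present in the
image) to see what actually backs /tmp; I just haven't done that yet:

  # show which filesystem /tmp lives on (tmpfs, overlay on local disk, or NFS)
  df -hT /tmp
  findmnt -T /tmp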
Sorry that I couldn’t find the answers online. While searching, I came
across the following pointers:
– https://wikitech.wikimedia.org/wiki/Ceph: This page has a warning that
it’s probably “no longer true”. If the warning is correct, perhaps the page
could be deleted entirely? Or maybe it could link to the current docs?
– https://wikitech.wikimedia.org/wiki/Swift: This sounds perfect, but the
page doesn't mention how the files get populated, how the ACLs are
managed, or whether Wikimedia's Swift cluster is even accessible to
external developers.
– https://wikitech.wikimedia.org/wiki/Media_storage: This seems current (I
guess?), but the page doesn’t mention if/how external Toolforge/Cloud-VPS
users may upload objects, or if this is just for the current users.
Thanks for your help, and happy holidays,
— Sascha, sascha(a)brawer.ch
Hello everyone
We have noticed that since September 2020, our WLM-related site
www.wikilm.es has been receiving a couple of meta=siteinfo api queries per
day from Cloud VPS NAT egress 185.15.56.1 with a user agent of...
${user_agent}
(yes, literally that. Probably a case of single quotes when double were
needed)
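For example, in a shell script the classic difference would be (the API
URL here is just illustrative):

  curl -A '${user_agent}' 'https://example.org/w/api.php?action=query&meta=siteinfo'   # sends the literal text ${user_agent}
  curl -A "${user_agent}" 'https://example.org/w/api.php?action=query&meta=siteinfo'   # expands the variable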
It's probably some kind of Wiki Loves Monuments statistics tool. I have
looked at some suspects with no luck.
Does anyone have an idea which tool may be doing that?
Regards
Hi everyone!
The second edition of the Coolest Tool Award
https://meta.wikimedia.org/wiki/Coolest_Tool_Award
will happen online on Friday 14 January 2022 at 17:00 UTC!
See https://zonestamp.toolforge.org/1642179615 for your timezone.
The awarded tools will be showcased in a virtual event, with
broadcast video and chat channels for socializing.
We will send more details and links soon.
Save the date, and join us celebrating the great work volunteer
developers do for the Wikimedia communities.
We hope to see you there!
andre, for the Coolest Tool Academy 2021
--
Andre Klapper (he/him) | Bugwrangler / Developer Advocate
https://blogs.gnome.org/aklapper/
What's the best way to do this? I noticed a rebuild instance option and
tried it to upgrade to Bullseye. It processed for a while, rebooted, and
now says the instance is based on the Bullseye image, but I don't see that
it upgraded anything.
Tim
Hi,
<https://packagist-mirror.wmcloud.org/> is exactly what it sounds like,
a mirror of the metadata on packagist.org. It's very simple, just a
systemd timer and some Apache config. Since setting it up 2 or 3 years
ago I've barely had to touch it. And based on access logs, some people
are actually using it (though we never switched CI over to it, as the
idea/goal once was).
But after being reminded about it by the current Cloud VPS cleanup, I'm
not interested in running this anymore. packagist.org is actually open
source (*looks at npmjs.com in sadness*) and hasn't had any reliability
issues recently AFAIR. There's also a bit of work to do: the VM needs to
be upgraded from Stretch, and it should be switched over to the new
composer/mirror script from upstream.
So if someone would like to take it over, comment on
<https://phabricator.wikimedia.org/T296968> and I'm happy to add you and
explain how it works. Otherwise please consider this notice that I
intend to shut it down at the end of the year.
-- Legoktm
Hi all,
Starting Nov 7, a number of the jobs I would run through Toolforge grid
have stopped working. Each job consists of a .sh file like this
<https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/grid/jobs…>
on the first line of which I use the source command to activate a python
virtual environment. When I run source by hand, subsequent lines work. But
when I call the .sh file and it tries to run the source command, I get a
"source: not found" message, the virtual environment does not get
activated, and running *which python* returns */usr/bin/python*, which is
bad. All my scripts depend on pip packages that are installed in the
virtual env and not available with the system python.
The main thing I did on Nov 7 was to add a line at the end of my tool
account's .bash_profile, as below:
exec zsh
This is because when I manually log into toolforge, I would like zsh to be
my shell, and since tool accounts don't support chsh, I thought executing
zsh directly from bash would be okay. But apparently, that now breaks the
source command somehow.
So I wonder:
(a) Is there a way to properly change the default shell of tool accounts?
(b) Is there a way to make *source* work under zsh?
Importantly, I know the problem is with *exec zsh* because once I removed
it and logged out and back in, all scripts worked correctly.
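One workaround I'm considering (an untested sketch; I don't know if it is
the recommended approach) is to only exec zsh for interactive logins, so
that non-interactive grid jobs keep running under bash:

  # in .bash_profile: switch to zsh only when stdin is a terminal
  if [ -t 0 ] && [ -z "$ZSH_VERSION" ]; then
      exec zsh
  fi

but I'd still like to know the proper way.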
Thanks,
Huji