Yes, the logs subcommands in the webservice and toolforge-jobs tools can
currently only read logs from a running tool, and logs are lost when a
service crashes, is stopped, or is restarted.
For some backstory: as you know, the NFS storage setup we currently use
for both tool code and logs is a major pain to keep up and running, and it
gives a very poor user experience when trying to follow logs in real time.
The new build service
<https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service> moves
tool code off NFS, which unblocks running a tool without any NFS mounts. The
current versions of the logs subcommands are meant to provide at least some
way to read logs for these early off-NFS tools - it's essentially a fancy
wrapper for `kubectl logs` at this point. Of course there are some tools
that need longer log retention, but for simple ones (like db-names
<https://db-names.toolforge.org/> which I'm using to test many new
build service features) it's perfectly usable.
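To give a feel for what "a fancy wrapper for `kubectl logs`" means, here is a rough Python sketch. It is not the actual webservice or toolforge-jobs code: the helper names, the label selector, and the namespace in the usage note are made up for illustration, and only the standard `kubectl logs` flags (`--namespace`, `--selector`, `--follow`, `--tail`) are assumed.

```python
import subprocess
from typing import List, Optional


def build_logs_command(
    selector: str,
    namespace: str,
    follow: bool = False,
    tail: Optional[int] = None,
) -> List[str]:
    """Build the argv for a `kubectl logs` call (hypothetical helper)."""
    cmd = ["kubectl", "logs", "--namespace", namespace, "--selector", selector]
    if follow:
        # Keep streaming new log lines until interrupted.
        cmd.append("--follow")
    if tail is not None:
        # Only show the last `tail` lines.
        cmd.append(f"--tail={tail}")
    return cmd


def show_logs(selector: str, namespace: str, **kwargs) -> int:
    """Run kubectl, let it print to the terminal, and return its exit code."""
    return subprocess.run(build_logs_command(selector, namespace, **kwargs)).returncode
```

Something like `show_logs("app=example", "tool-example", follow=True)` would then stream the logs of the matching pods (both values are placeholders). Because a wrapper like this only asks Kubernetes for whatever the pod currently holds, there's nothing left to read once the pod goes away.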
The idea is to swap the commands to use a better log management system once
we have deployed one to Toolforge. The good news is that one of the major
blockers for that (lack of object storage) is almost solved, so I'm hoping
to see some movement on that project fairly soon (although, as usual, it's
really hard to promise anything). I expect that project will be tracked in
subtasks of T127367 <https://phabricator.wikimedia.org/T127367> once
there's anything beyond a very rough idea.
Taavi
On Thu, Oct 5, 2023 at 9:10 PM Roy Smith <roy(a)panix.com> wrote:
I see T336057 was just closed. Looking at the docs
<https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#View_web_service_logs>,
I'm unclear how this works. The docs say ", the output from the webservice
command is stored by the Toolforge Kubernetes infrastructure as long as the
web service is running." So, what happens when a service exits (i.e.
crashes)? Does that mean the logs for that service disappear?
--
Taavi Väänänen (he/him)
Site Reliability Engineer, Cloud Services
Wikimedia Foundation