Yesterday, I reported some log messages in T318479.
When I went to look at this again today, the messages were gone. After a bit of head-scratching, I discovered they were now in a .nfs file:
(venv) spi-tools-dev [django] grep 76e999afc82c10fb99b6c9bf76448d1a .nfs0000000005f910c800000388
2022-09-28 16:43:39,873 [76e999afc82c10fb99b6c9bf76448d1a] INFO tools_app.middleware: IndexView()
2022-09-28 16:59:18,903 [76e999afc82c10fb99b6c9bf76448d1a] ERROR tools_app.redis: Redis ConnectionError: Error while reading from tools-redis.svc.eqiad.wmflabs:6379 : (110, 'Connection timed out')
2022-09-28 16:59:19,196 [76e999afc82c10fb99b6c9bf76448d1a] INFO tools_app.middleware: request took 0:15:39.323408
These log files are created by Python's TimedRotatingFileHandler, so it looks like something was holding the file open at the time it was rotated. In theory, I should be able to find which process has it open using lsof, but that doesn't work when I run it on tools-sgebastion-11:
lsof .nfs0000000005f910c800000388
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
Output information may be incomplete.
and if I shell into the krb instance, I just get:
bash: lsof: command not found
So how do I figure out what's going on?
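On a host where lsof is missing or gives incomplete output, one fallback might be to walk /proc directly and compare each open file descriptor against the .nfs file. Below is a minimal sketch in Python, assuming a Linux /proc filesystem. The helper name find_openers is hypothetical, not part of any tool mentioned above. It can only see processes on the host where it runs, and only those whose fd directories the current user may read, so on NFS it would miss a holder running on another node.

```python
import os


def find_openers(target, proc_root="/proc"):
    """Return (pid, fd) pairs for local processes that have `target` open.

    Hypothetical helper: matches by (device, inode) rather than by path,
    so it still works after the file has been renamed.
    """
    target_stat = os.stat(target)
    key = (target_stat.st_dev, target_stat.st_ino)
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                # stat() follows the /proc/<pid>/fd/<n> link to the open file
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if (st.st_dev, st.st_ino) == key:
                hits.append((int(entry), int(fd)))
    return hits
```

Calling find_openers(".nfs0000000005f910c800000388") from the directory containing the file would list any matching local processes. An empty result would suggest the file is held open from a different host.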