Hello!

wget was the tool I used in the jsub environment, but it is no longer available on Kubernetes (with toolforge jobs start …) :-(

$ webservice php7.4 shell
tools.persondata@shell-1705135256:~$ wget
bash: wget: command not found


Wolfgang


On Sat, 13 Jan 2024 at 02:20, Platonides <platonides@gmail.com> wrote:
Gerhard said that for him the downloading job ran for about 12 hours. It seems the connection was closed.
I wouldn't be surprised if this was facing a similar problem as https://phabricator.wikimedia.org/T351876

With such a long download time, it isn't that strange that there could be connection errors (still something to look into, though; toolserver-to-Prod traffic shouldn't be suffering from that).

wget (used by Gerhard) retries automatically; perhaps curl doesn't, and is thus more susceptible to these errors.

Try changing your job to:
wget -O - https://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-articles-multistream.xml.bz2 | bzip2 -d | tail
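
Where wget is not installed, the same retry idea could perhaps be approximated with curl. The following is only a minimal sketch, assuming that curl and bzip2 are present in the container image (not confirmed in this thread). The names retry, RETRY_DELAY and dump.xml.bz2 are hypothetical and chosen for illustration. It downloads to a local file instead of a pipe, so that an interrupted transfer can be resumed with curl's --continue-at - rather than restarted from the beginning.

```shell
#!/bin/bash
# Hypothetical helper: re-run a command until it succeeds,
# giving up after the number of attempts given as first argument.
retry() {
    local max_attempts=$1; shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1    # still failing after the last allowed attempt
        fi
        attempt=$((attempt + 1))
        # RETRY_DELAY is an assumed knob: seconds to wait between attempts.
        sleep "${RETRY_DELAY:-10}"
    done
}

# Example usage (commented out, since it needs network access and disk space):
# url=https://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-articles-multistream.xml.bz2
# retry 20 curl --fail --location --continue-at - --output dump.xml.bz2 "$url" \
#     && bzip2 -dc dump.xml.bz2 | tail
```

The trade-off is that the compressed dump has to fit on disk before it is decompressed, whereas the piped wget command above streams it.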