Hello,
Since June 27th, any CI job running 'npm install' may be delayed by an extra 10 minutes.
Somehow, when npm requests package information from the npmjs CDN (Cloudflare), the connection hangs for ten minutes. npm just idles waiting for a reply, then eventually shows:
npm ERR! registry error parsing json
npm then retries and proceeds as usual.
The JSON error is caused by a Cloudflare HTML page stating:
The page could not be rendered due to a temporary fault.
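To illustrate the failure mode, here is a minimal Python sketch (not npm's actual code) of what happens when a client expects JSON but is handed an HTML error page. The HTML markup and the function name parse_registry_response are assumptions made up for the example; only the quoted sentence comes from the real page.

```python
import json

# Hypothetical stand-in for the CDN error page. Only the quoted sentence
# comes from the report above; the surrounding markup is invented.
HTML_ERROR_PAGE = (
    "<html><body>"
    "The page could not be rendered due to a temporary fault."
    "</body></html>"
)


def parse_registry_response(body):
    """Return (document, error): parsed JSON on success, else an error string."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        # An HTML body is not valid JSON, so parsing fails at the first character.
        return None, "error parsing json: {}".format(exc.msg)


if __name__ == "__main__":
    print(parse_registry_response('{"name": "example-package"}'))
    print(parse_registry_response(HTML_ERROR_PAGE))
```

The point is only that the parse error is a symptom: the body that finally arrives after the long wait is HTML, not the package metadata.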
The impact is that any Jenkins job using npm has a high chance of taking 10 more minutes to build. That notably affects MediaWiki core and all its extensions.
A few minutes ago, I made a change to run npm with --loglevel=info, which makes npm emit more information to the console and should give some hints about what it is doing. (verbose would produce far too much log output, though.)
I have filed a bug with npm: https://github.com/npm/npm/issues/21101
Our task: https://phabricator.wikimedia.org/T198348
I have no idea how to mitigate the issue :-(
If we were to upgrade npm to 5+ (I think), we would get support for package-lock.json: https://docs.npmjs.com/files/package-lock.json
That file would be committed to our code repository and would allow CI to skip the dependency resolution step(s). This means that the tar/zip download or clone would happen directly, without involving registry.npmjs.org at all.
Upgrading npm (and adding a package-lock.json file) might be outside the scope of this issue, but I think it would help mitigate or resolve the problem.
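As a rough illustration of what such a file pins, here is a small Python sketch that walks a lock file and lists the exact version and download URL recorded for each package. The sample content is hypothetical (package names, versions, URLs and integrity values are invented), and it assumes the lockfileVersion 1 layout with "version", "resolved" and nested "dependencies" keys.

```python
import json

# Hypothetical lock file content, assuming the lockfileVersion 1 layout.
# Package names, versions, URLs and integrity values are made up.
SAMPLE_LOCK = """
{
  "name": "example-project",
  "version": "1.0.0",
  "lockfileVersion": 1,
  "dependencies": {
    "example-lib": {
      "version": "2.3.4",
      "resolved": "https://registry.example.org/example-lib/-/example-lib-2.3.4.tgz",
      "integrity": "sha512-PLACEHOLDER",
      "dependencies": {
        "example-nested": {
          "version": "0.1.0",
          "resolved": "https://registry.example.org/example-nested/-/example-nested-0.1.0.tgz",
          "integrity": "sha512-PLACEHOLDER"
        }
      }
    }
  }
}
"""


def pinned_packages(lock, prefix=""):
    """Yield (path, version, resolved) for every entry, including nested ones."""
    for name, entry in sorted(lock.get("dependencies", {}).items()):
        path = prefix + name
        yield path, entry.get("version"), entry.get("resolved")
        # Nested entries use the same shape, so recurse with a longer path.
        yield from pinned_packages(entry, prefix=path + " > ")


if __name__ == "__main__":
    for path, version, resolved in pinned_packages(json.loads(SAMPLE_LOCK)):
        print(path, version, resolved)
```

Because each entry already names an exact version and a download location, the installer does not have to work those out again on every run.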
On Thu, Jun 28, 2018 at 10:26 AM Antoine Musso hashar+wmf@free.fr wrote:
<snip>
-- Antoine "hashar" Musso
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Thu, Jun 28, 2018 at 4:42 PM David Barratt dbarratt@wikimedia.org wrote:
If we were to upgrade npm to 5+ (I think) that would support package-lock.json
That is discussed in https://phabricator.wikimedia.org/T179229
Željko
On 28/06/2018 16:25, Antoine Musso wrote:
npm ERR! registry error parsing json
<snip>
Npmjs seems to have implemented a fix although we are still hitting the issue: https://status.npmjs.org/incidents/51c7q80zsj9f
A few minutes ago, I bumped the default timeout from 30 minutes to 45 minutes. Jobs will still be slow, but at least they should succeed (when they should).
https://gerrit.wikimedia.org/r/#/c/integration/config/+/442988/
On 28/06/2018 23:28, Antoine Musso wrote:
npm ERR! registry error parsing json
<snip>
Hello,
The issue from June 28th had been resolved, but it reappeared today.
CI jobs using npm are once again showing the delay. Same symptom:
npm ERR! registry error parsing json
I have bumped the job timeout again from 30 minutes to 45 minutes and reopened the task: https://phabricator.wikimedia.org/T198348