When Package Dependencies Become Problematic

By James Ives. Published 7 months ago. 6 min read.

Back in early January, a couple of popular Node libraries were deliberately corrupted by their maintainer in a new release. Because of the way npm dependencies work, this created a ripple effect that broke a number of other very popular libraries. Without going into the maintainer's motivation, I want to briefly touch on how this happened and what you can do to protect your projects against something similar.


Before we get started, a quick refresher. If you've worked with an npm-based project before, you'll be familiar with the package.json and package-lock.json files along with the node_modules directory. When you run npm install, npm reads the version ranges in package.json, resolves each one to a specific version, and records the result in the lock file. It then downloads those versions from the registry and stores them in the node_modules folder.

OK, So What?

So far this might seem obvious but give me a moment to explain. Let's take a closer look at the contents of a package.json file.

{
  "name": "@jamesives/not-a-real-project",
  "author": "James Ives",
  "dependencies": {
    "jest": "27.0.6",
    "lit": "^2.0.0",
    "rollup": "^2.0.0"
  }
}

After running npm install, npm stores the versions it resolved in the lock file and fetches the associated packages from the registry. When we inspect what was actually installed, it paints a different picture from package.json. You'll notice that the versions of two of the packages don't match: npm downloaded rollup 2.67.0 and lit 2.1.2 when ^2.0.0 was specified.

Jives:not-a-real-project ives$ npm list --depth=0
@jamesives/not-a-real-project@1.0.0
├── jest@27.0.6
├── lit@2.1.2
└── rollup@2.67.0

The cause of this discrepancy is the ^ symbol. When it is prepended to a package's version number, it tells npm to accept any compatible version, meaning everything that does not increment the first non-zero portion of the number. If a package has versions 2.0.0, 2.1.0 and 2.2.2 on the registry and you put ^2.0.0 in your package dependencies, a fresh npm install resolves to 2.2.2, and that is the version recorded in your lock file.
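To make the rule concrete, here is a minimal sketch of caret matching, written in Python for brevity. It is not npm's implementation: it assumes plain major.minor.patch versions with no pre-release tags, and the function names are made up for illustration.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Split a plain 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: Version) -> Version:
    """Return the first version that a caret range on `base` excludes."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(version: str, caret_range: str) -> bool:
    """True when `version` falls inside a range such as '^2.0.0'."""
    base = parse(caret_range.lstrip("^"))
    return base <= parse(version) < caret_upper_bound(base)


def resolve(caret_range: str, published: List[str]) -> Optional[str]:
    """Pick the highest published version that the range allows."""
    matches = [v for v in published if satisfies_caret(v, caret_range)]
    return max(matches, key=parse, default=None)


print(resolve("^2.0.0", ["2.0.0", "2.1.0", "2.2.2", "3.0.0"]))  # prints 2.2.2
print(satisfies_caret("6.6.6", "^6.0.0"))  # prints True
```

Note that 3.0.0 is skipped because it increments the first non-zero portion, while anything published under 2.x is fair game.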

I'm calling this out specifically because it is how npm adds packages to the dependencies list in package.json by default.

The Inherited Risk of Trusting Semver

In a perfect world, if a project follows semantic versioning, you should have nothing to worry about, as you'll never install a version that isn't compatible. You can also make the argument that it improves the security of your project, as you'll pick up the latest patches during regular feature development each time you run the install command. However, this isn't always the case. Libraries are not required to follow semantic versioning (it is only a recommendation), and it's entirely possible for a breaking change to be introduced through a minor version or even a patch. It's a mistake to assume that all open source maintainers are aware of this recommendation or care to follow it.

Coming back to the incident I mentioned earlier, compatible-version ranges are how so many projects became infected. The latest major version was 6.0.0 and the infected version published was 6.6.6. This means that anyone with ^6.0.0 in their package dependencies would get the infected version the next time npm resolved that range. This caused such a big issue that GitHub and npm had to step in, take action against the user, and remove the infected versions from the registry.

It All Comes Crashing Down

Where things can take a turn for the worse is when you use continuous integration (CI) tools such as GitHub Actions or Jenkins for your deployment pipelines. Take the following example from the Jenkins website:

pipeline {
    agent {
        docker {
            image 'node:lts-buster-slim'
            args '-p 3000:3000'
        }
    }
    stages {
        stage('Build') {
            steps {
                sh 'npm install'
            }
        }
        stage('Test') {
            steps {
                sh './jenkins/scripts/test.sh'
            }
        }
        stage('Deliver') {
            steps {
                sh './jenkins/scripts/deliver.sh'
            }
        }
    }
}

In this example, let's assume that your package file looks similar to the one above and that you're very careful about which versions you commit to the lock file. Jenkins runs the same install command you run locally, and npm install is allowed to rewrite the lock file: if the lock file is missing, stale, or out of step with package.json, npm resolves the ranges again. Even if you think you're using the latest version of a package, a new version published just before Jenkins runs the production build can end up installed, which means shipping a version you never tested your application with. This can introduce unexpected bugs, breakages, or even a security vulnerability.

That Sounds Scary…

It is, but it's not all doom and gloom. Let's go over the options.

npm ci

Use npm ci (named after continuous integration) in your CI pipelines instead of npm install. It deletes any existing node_modules folder and installs exactly the versions recorded in the lock file rather than re-resolving the ranges in package.json. The package.json file is only used to validate that the two files agree; if they don't, npm ci exits with an error instead of updating the lock file. This ensures that the versions you commit to the lock file are the ones your build tools use, making builds much more predictable, stable, and safe.
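The validation step can be pictured with a rough sketch like the one below. This is only an approximation of the idea, not what npm actually runs: it assumes a lock file in the lockfileVersion 2 or 3 layout (a "packages" object keyed by "node_modules/<name>"), it only understands exact pins and caret ranges with a major version of 1 or higher, and the helper names are hypothetical.

```python
from typing import List


def allows(spec: str, version: str) -> bool:
    """Handle only exact pins ('27.0.6') and caret ranges with major >= 1."""
    if not spec.startswith("^"):
        return spec == version
    base = [int(part) for part in spec[1:].split(".")]
    found = [int(part) for part in version.split(".")]
    # Same major version, and not older than the version the range starts at.
    return found[0] == base[0] and found >= base


def out_of_sync(manifest: dict, lock: dict) -> List[str]:
    """List dependencies whose locked version is missing or outside the declared range."""
    problems = []
    for name, spec in manifest.get("dependencies", {}).items():
        entry = lock.get("packages", {}).get(f"node_modules/{name}")
        if entry is None:
            problems.append(f"{name}: missing from lock file")
        elif not allows(spec, entry["version"]):
            problems.append(f"{name}: locked {entry['version']} does not satisfy {spec}")
    return problems


manifest = {"dependencies": {"jest": "27.0.6", "lit": "^2.0.0", "rollup": "^2.0.0"}}
lock = {
    "packages": {
        "node_modules/jest": {"version": "27.0.6"},
        "node_modules/lit": {"version": "2.1.2"},
        "node_modules/rollup": {"version": "2.67.0"},
    }
}
print(out_of_sync(manifest, lock))  # prints []
```

An empty list corresponds to the case where npm ci would go ahead and install; anything else is the kind of mismatch that makes it fail rather than quietly change the lock file.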

On GitHub alone, there are over a million instances of npm install in .yml files, which at a glance mostly belong to CI pipelines.

Use exact version numbers

I'm of the opinion that exact version numbers are much better than compatibility ranges. They're more readable, in that you can see at a glance which versions are installed, and they're more predictable. Accidentally committing dependency bumps without proper testing isn't ideal; it's better to dedicate proper time and effort to the process. Although this article focuses on npm, other ecosystems can suffer the same consequences. Even GitHub suggests that project maintainers offer a major version tag for GitHub Actions, which can have serious consequences for the consuming project: if the maintainer overwrites that tag, they can introduce a breaking change or vulnerability the next time your workflow runs.
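If you want to find out how many ranges are hiding in an existing project, a small audit script is one option. The sketch below is written in Python, treats anything that is not a plain major.minor.patch string as unpinned, and uses a made-up function name; adjust the rule to taste.

```python
import re
from typing import Dict

# A spec counts as pinned only when it is a bare 'major.minor.patch' string.
EXACT = re.compile(r"^\d+\.\d+\.\d+$")


def unpinned(manifest: dict) -> Dict[str, str]:
    """Return every dependency whose spec is a range, tag, or URL rather than an exact version."""
    loose = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT.match(spec):
                loose[name] = spec
    return loose


manifest = {
    "name": "@jamesives/not-a-real-project",
    "dependencies": {"jest": "27.0.6", "lit": "^2.0.0", "rollup": "^2.0.0"},
}
print(unpinned(manifest))  # prints {'lit': '^2.0.0', 'rollup': '^2.0.0'}
```

To stop new ranges from creeping in, npm install accepts a --save-exact flag, and setting save-exact=true in your npm config makes that the default.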

Use Dependabot or any other form of dependency management

You can leverage Dependabot or another external dependency management tool to make dependency bumps hassle-free. If Dependabot isn't an option for you, you can instead use npm outdated to list the packages that don't match the latest available version. Using your best judgment, you can then test and integrate them into your project manually.

Jives:@jamesives/not-a-real-project ives$ npm outdated
Package  Current  Wanted  Latest  Location
jest      27.0.6  27.0.6  27.5.1  @jamesives/not-a-real-project
lit        2.1.2   2.1.3   2.1.3  @jamesives/not-a-real-project
rollup    2.67.0  2.67.2  2.67.2  @jamesives/not-a-real-project
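When deciding which bumps deserve the most testing, it can help to sort them by how large the jump is. The sketch below, in Python, classifies each entry as a major, minor, or patch bump. It assumes input shaped like the output of npm outdated --json (an object keyed by package name with "current" and "latest" fields); the sample data is copied by hand from the table above, and the function names are hypothetical.

```python
from typing import Dict


def bump_kind(current: str, target: str) -> str:
    """Name the most significant version component that differs."""
    old = [int(part) for part in current.split(".")]
    new = [int(part) for part in target.split(".")]
    for label, before, after in zip(("major", "minor", "patch"), old, new):
        if before != after:
            return label
    return "none"


def summarize(outdated: dict) -> Dict[str, str]:
    """Map each outdated package to the size of the jump to its latest version."""
    return {
        name: bump_kind(info["current"], info["latest"])
        for name, info in outdated.items()
    }


outdated = {
    "jest": {"current": "27.0.6", "wanted": "27.0.6", "latest": "27.5.1"},
    "lit": {"current": "2.1.2", "wanted": "2.1.3", "latest": "2.1.3"},
    "rollup": {"current": "2.67.0", "wanted": "2.67.2", "latest": "2.67.2"},
}
print(summarize(outdated))  # prints {'jest': 'minor', 'lit': 'patch', 'rollup': 'patch'}
```

Bear in mind the earlier caveat: a patch label only tells you what the version number claims, not what actually changed.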

For GitHub Enterprise users, the Dependabot pull request script is available until proper Dependabot support is offered.

In Conclusion

Even though these problems are rare, it's always important to optimize for the worst possible case. If you have any comments or questions, you can reach me on Twitter or leave a note below.