The pain that minimal version selection solves

By Nick Snyder on August 31, 2018


Minimal version selection is an idea that Russ Cox proposed for how to resolve the dependencies of Go modules. When installing or updating dependencies, minimal version selection always selects the minimal (oldest) module version that satisfies the overall requirements of a build.

Minimal version selection has a lot of nice properties:

  • It is simple to understand and implement.
  • It is fast to compute because it avoids solving Version-SAT, which is NP-complete.
  • It produces high-fidelity builds by default because the dependencies that a user builds are as close as possible to the ones the author developed against.
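
The selection rule can be sketched in a few lines of Go. This is a simplified illustration, not the implementation in the go command: the module names (app, a, b, c), the Version and Module types, and the BuildList function are all hypothetical. Each module declares the minimum version it needs of each dependency, and the build uses, for every module, the highest of those declared minimums. Versions that nobody asked for are never considered.

```go
package main

import (
	"fmt"
	"sort"
)

// Version is a simplified semantic version (major.minor.patch).
type Version struct{ Major, Minor, Patch int }

// Less reports whether v is older than o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	if v.Minor != o.Minor {
		return v.Minor < o.Minor
	}
	return v.Patch < o.Patch
}

func (v Version) String() string {
	return fmt.Sprintf("%d.%d.%d", v.Major, v.Minor, v.Patch)
}

// Module identifies one specific version of a module.
type Module struct {
	Name    string
	Version Version
}

// Requirements maps a module version to the minimum versions it
// declares for its direct dependencies.
type Requirements map[Module][]Module

// BuildList walks the requirement graph from root and keeps, for each
// module name, the highest of the declared minimum versions.
func BuildList(root Module, reqs Requirements) map[string]Version {
	selected := map[string]Version{}
	visited := map[Module]bool{}
	var visit func(m Module)
	visit = func(m Module) {
		if visited[m] {
			return
		}
		visited[m] = true
		if cur, ok := selected[m.Name]; !ok || cur.Less(m.Version) {
			selected[m.Name] = m.Version
		}
		for _, dep := range reqs[m] {
			visit(dep)
		}
	}
	visit(root)
	return selected
}

func main() {
	// Hypothetical graph: a needs c >= 1.1.0, b needs c >= 1.3.0.
	app := Module{"app", Version{1, 0, 0}}
	a := Module{"a", Version{1, 2, 0}}
	b := Module{"b", Version{1, 0, 0}}
	reqs := Requirements{
		app: {a, b},
		a:   {{"c", Version{1, 1, 0}}},
		b:   {{"c", Version{1, 3, 0}}},
	}

	list := BuildList(app, reqs)
	names := make([]string, 0, len(list))
	for name := range list {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s %s\n", name, list[name])
	}
}
```

In this sketch c resolves to 1.3.0, the oldest version that satisfies both a and b. If a newer c were published tomorrow, the result would not change until some module in the graph raised its declared minimum.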

Many other popular dependency managers, like NPM, prefer to install the latest version of dependencies by default. I will share a recent real-world example of the kind of pain that minimal version selection prevents.

Background

Sourcegraph recently open sourced our browser extensions, but before we could, I needed to remove a transitive dependency on our private icons NPM package that contains licensed assets.

Before:

github.com/sourcegraph/browser-extensions (open source)
└─┬ @sourcegraph/codeintellify@3.5.3 (open source)
  └── @sourcegraph/[email protected] (private)

Codeintellify provides hover tooltips on code. Both our web app and our browser extensions depend on it.

Removing codeintellify's dependency on our private icons repository was straightforward because it was only using a single loading spinner. All I needed to do was create an NPM package for our open source loading spinner, update codeintellify to use the new package, and finally update the browser extension to depend on the latest version of codeintellify: 3.6.0.

After:

github.com/sourcegraph/browser-extensions (open source)
└─┬ @sourcegraph/codeintellify@3.6.0 (open source)
  └── @sourcegraph/[email protected] (open source)

Everything worked; mission accomplished!

Pain

Renovate Bot noticed that there was a new version of codeintellify so it helpfully created a pull request to update our main repository to codeintellify 3.6.0.

[Image: diff]

After CI passed, another engineer merged the pull request. Soon thereafter, our end-to-end tests (which run only on master, after deploying to our staging environment) started failing because hover tooltips, a core feature of Sourcegraph, were broken.

[Image: error]

Fortunately, the end-to-end test failure blocked this from being deployed to sourcegraph.com.

Since I was asleep at the time, and the breakage was apparently caused by codeintellify 3.6.0 (and transitively, me), my teammate just reverted the commit.

When I arrived at work the next day, I was confused. What could have possibly gone wrong?

  • Everything had worked fine when I integrated this library in our browser extensions.
  • There is not much opportunity to subtly break the site with 18 lines of CSS and 10 lines of TypeScript.
  • The error message didn't make any sense, and wasn't related to anything I had done!

After scratching my head for a bit, I went back to look at the diff and realized that GitHub had helpfully auto-collapsed package-lock.json. Of course I don't need to see that; it is for machines, right?

I expanded the diff and my heart sank.

[Image: package-lock]

NPM had "helpfully" seen that there was a new patch release of sanitize-html and upgraded to the new version even though sanitize-html had no relationship to the package that Renovate was updating.
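
To see how a patch release can slip in unrequested, it helps to look at how semver ranges behave. The sketch below is an illustration under assumptions, not NPM's code: it assumes, hypothetically, that some package declared sanitize-html with the caret range ^1.18.2 (the article does not show the actual range), and the Parse, SatisfiesCaret, and NewestSatisfying helpers are made up for this example. For a base version with major >= 1, a caret range accepts any version with the same major number that is at least the base.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a simplified semantic version (major.minor.patch).
type Version struct{ Major, Minor, Patch int }

// Parse reads a plain "major.minor.patch" string. Pre-release and
// build metadata are deliberately not handled in this sketch.
func Parse(s string) (Version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("invalid version %q", s)
	}
	var n [3]int
	for i, p := range parts {
		x, err := strconv.Atoi(p)
		if err != nil {
			return Version{}, fmt.Errorf("invalid version %q: %w", s, err)
		}
		n[i] = x
	}
	return Version{n[0], n[1], n[2]}, nil
}

// Less reports whether v is older than o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	if v.Minor != o.Minor {
		return v.Minor < o.Minor
	}
	return v.Patch < o.Patch
}

func (v Version) String() string {
	return fmt.Sprintf("%d.%d.%d", v.Major, v.Minor, v.Patch)
}

// SatisfiesCaret reports whether v falls in the range ^base, for a
// base with Major >= 1: same major version, and not older than base.
func SatisfiesCaret(v, base Version) bool {
	return v.Major == base.Major && !v.Less(base)
}

// NewestSatisfying returns the highest published version in ^base,
// which is what a "prefer latest" resolver would pick.
func NewestSatisfying(published []Version, base Version) (Version, bool) {
	var best Version
	found := false
	for _, v := range published {
		if SatisfiesCaret(v, base) && (!found || best.Less(v)) {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	base, err := Parse("1.18.2") // hypothetical declared minimum
	if err != nil {
		panic(err)
	}
	published := []Version{{1, 18, 2}, {1, 18, 3}}

	if newest, ok := NewestSatisfying(published, base); ok {
		fmt.Println("newest satisfying:", newest)
	}
	// A minimal policy simply keeps the declared minimum.
	fmt.Println("minimal:", base)
}
```

Under these assumptions, a newest-first policy moves to 1.18.3 as soon as it is published, while a minimal policy stays on 1.18.2 until someone deliberately raises the requirement.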

Worse, that "patch" version of sanitize-html was completely broken (#241, #242).

Reflection

I was upset. Not with sanitize-html though; people make mistakes. In fact, I assume that people will make mistakes (in this case the mistake was quickly fixed in 1.18.4, and I was able to move on with my life).

I was upset because the design of NPM assumes things that just aren't true in general:

  • People don't make mistakes.
  • People perfectly understand and apply semantic versioning (see also: compatible versioning).
  • Having the ecosystem converge to the latest dependency versions as fast as possible is more important than the stability of individual projects.

To the contrary, I expect NPM and tools like it to:

  • Assume that people (including myself) will make mistakes and do as much as possible to guard against those mistakes.
  • Optimize for working dependencies (i.e. the ones I am already using and have tested), not the latest dependencies.
  • Not update a dependency unless I explicitly ask it to (#1156, #2348).
  • Make lockfiles less cumbersome. Lockfiles should:

This experience was the most benign manifestation of this problem. It would have been worse if any or all of the following were true:

  • My change was more complicated.
  • The aggregate diff between codeintellify 3.5.3 and 3.6.0 included other changes that weren't mine.
  • The aggregate diff between sanitize-html 1.18.2 and 1.18.3 was larger.
  • sanitize-html didn't quickly release a patch to fix the issue.
  • sanitize-html had transitive dependencies that were also updated.
  • NPM had updated more than one unrelated package.
  • Tests didn't catch the regression and my change was deployed to production.

I am happy that this class of problem won't exist with Go modules, and I hope that other package managers evolve to solve these problems.

Thanks

Special thanks to Felix (@felixfbecker) for ensuring that end-to-end tests must pass on staging before deploying to sourcegraph.com, discussing minimal version selection with me, authoring a tool to create new NPM packages with the right boilerplate, searching for workarounds to get the behavior I want out of npm, and ultimately filing detailed issues to drive change in the ecosystem (#1156, #2348).