package.json

Avoiding Dependencies Reinstall When Building NodeJS Docker Image

When I first started writing Dockerfile for building a NodeJS app, I will do:

1
2
3
WORKDIR /app
COPY . /app
RUN npm install

Sure, this is very simple, but every time a change in the source file, the entire dependency tree needs to be re-installed. The only time you need to rebuild is when package.json changes.

One trick is to move package.json elsewhere, and build the dependencies, then move back:

1
2
3
4
5
WORKDIR /app
ADD package.json /tmp
RUN cd /tmp && npm install
COPY . /app
RUN mv /tmp/node_modules /app/node_modules

However, if there are a lot of depending packages with lots of files, the last step will take a long time.

How about using symbolic link to shorten the time?

1
2
3
4
5
WORKDIR /app
ADD package.json /tmp
RUN cd /tmp && npm install
COPY . /app
RUN ln -s /tmp/node_modules /app/node_modules

Well, it works, but not every NPM package is happy about it. Some will complain about the softlink, result in errors.

So, how can we reuse caches as much as possible, and:

  1. Do not need to reinstall dependencies
  2. Do not use symbolic link

Here is a solution, just move down the COPY statement:

1
2
3
4
5
6
7
8
9
10
11
12
13
# Set the working directory, which creates the directory.
WORKDIR /app
# Install dependencies.
ADD package.json /tmp
RUN cd /tmp && npm install
# Move the dependency directory back to the app.
RUN mv /tmp/node_modules /app
# Copy the content into the app directory. The previously added "node_modules"
# directory will not be overridden.
COPY . /app

With this configuration, if package.json stays the same, only the last cache layer will be rebuilt when source codes change.

Why Using Exact Versions in NPM Package Dependencies

Versions frequently used in NPM package dependencies are caret ranges or ^version. And npm install uses caret ranges as well.

Caret range:

Allows changes that do not modify the left-most non-zero digit in the [major, minor, patch] tuple. In other words, this allows patch and minor updates for versions 1.0.0 and above, patch updates for version 0.x >=0.1.0, and no updates for version 0.0.x.

Most of time, using caret ranges as dependency versions work perfectly fine. But once a while, there are bugs. In one of projects, JS-YAML with caret version ^3.4.3 was used and installed:

1
2
3
4
5
{
"dependencies": {
"js-yaml": "^3.4.3"
}
}

Time passes. When npm install was run again from a freshly cloned project code base, version 3.5.2 was installed. Since JS-YAML version 3.5.0, errors are thrown when safe loading a spec with duplicate keys. If there are no duplicate keys in the YAML file, this will be fine. However, one of the files has it. The correct approach is to actually fix the duplicate keys. But it requires additional work back and forth. You’ve heard this before: “Hey it works when I installed it.”

The point here is to use exact versions instead of letting the package manager decides by dropping the caret character:

1
2
3
4
5
{
"dependencies": {
"js-yaml": "3.4.3"
}
}

This will avoid the above mentioned problem. We can manually update the outdated NPM packages.

Don’t let machine decide. You Do!